Recent Releases

Daemon 1.40.2

  • Features
    • The daemon can now collect content from sites that use Akamai to cache content. The embedded source URL is extracted from Akamai URLs if org.lockss.UrlUtil.normalizeAkamaiUrl is true, so that the files are collected from and stored under the source URL. Additional work is needed on the link rewriter to allow such content to be easily browsable.
    • Individual AUs can be collected via different proxies by setting the AU config param crawl_proxy (e.g., in the title DB) to host:port. Set to DIRECT to cancel the effect of a global proxy.
    • Plugins can control whether more than one of their AUs may crawl simultaneously, by specifying how fetch rate limiters are shared between AUs. (AUs sharing a rate limiter will not crawl at the same time.) By default all AUs belonging to a plugin share a rate limiter. Plugins may set plugin_fetch_rate_limiter_source to one of:
      • au - each AU gets its own rate limiter and multiple AUs may crawl simultaneously
      • plugin - all AUs belonging to the plugin share the same rate limiter and only one may crawl at a time
      • key:key - all AUs belonging to plugins that use the same key share a rate limiter
      • host:param - param should be one of the base URL AU config parameters of the plugin. The host part of the parameter value for the AU is extracted and used as the rate limiter key. (I.e., all AUs crawling from the same host will share a rate limiter.)
      • title_attr:attr - the value of the attribute attr in the AU's title DB entry is used as the rate limiter key
    • The crawler was previously hardwired to fetch no more than 10 files per minute (one fetch every 6000 ms), no matter how low a plugin set its au_def_pause_time. The minimum delay between fetches can now be changed by setting org.lockss.baseau.minFetchDelay (default 6000ms).
    • ListObjects servlet with arg type=files produces a tab-separated list of url, mime-type, size.
    • The maximum size of filtered streams recorded by HashCUS can be controlled by setting org.lockss.hashcus.truncateFilteredStream (default 100K). -1 means no limit.
    • Transmission speed of LCAP messages longer than org.lockss.scomm.minMeasuredMessageSize bytes (default 5MB) is reported in the log at debug level.
    • The ExplodedPlugin used by CLOCKSS boxes to ingest Elsevier and Springer source content has been restructured to make it definable and more like other plugins.
    • The Elsevier and Springer plugins for CLOCKSS now have initial support for metadata extraction, including DOIs.
    • Plugin jars generated by genplugin now include all .xml files in plugin dir, to allow for inheritance.
    • genkey accepts command line args to set certificate distinguished name (DN) values (from Monika @ MetaArchive).
  • Bug fixes
    • Added RandomManager to coordinate use of SecureRandom and ensure the desired algorithm is always used.
    • Unit tests seed SecureRandom to avoid exhausting kernel's entropy.
    • Exploder creates one AU per journal per year.
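
The plugin_fetch_rate_limiter_source values listed above each determine which AUs end up sharing a rate limiter. The following Python sketch illustrates that mapping only; it is not daemon code. The function name, its arguments, and the prefixed key format are hypothetical, chosen for illustration.

```python
from urllib.parse import urlparse


def rate_limiter_key(source, plugin_id, au_id, au_config, title_attrs):
    """Return the key under which an AU's fetch rate limiter would be shared.

    Hypothetical helper: the argument names and the prefixed key format are
    illustrative assumptions, not the daemon's internal representation.
    """
    if source == "au":
        return "au:" + au_id              # each AU gets its own limiter
    if source == "plugin":
        return "plugin:" + plugin_id      # one limiter for the whole plugin
    kind, _, arg = source.partition(":")
    if kind == "key":
        return "key:" + arg               # shared by plugins using this key
    if kind == "host":
        # arg names a base URL config parameter; the key is that URL's host
        return "host:" + urlparse(au_config[arg]).hostname
    if kind == "title_attr":
        # arg names an attribute in the AU's title DB entry
        return "title_attr:" + title_attrs[arg]
    raise ValueError("unrecognized rate limiter source: " + source)


# Two AUs whose base URLs are on the same host map to the same key.
key1 = rate_limiter_key("host:base_url", "ExamplePlugin", "au1",
                        {"base_url": "http://www.example.com/"}, {})
key2 = rate_limiter_key("host:base_url", "ExamplePlugin", "au2",
                        {"base_url": "http://www.example.com/journal/"}, {})
print(key1 == key2)  # True
```

AUs that map to the same key share a limiter and therefore do not crawl at the same time.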

Daemon 1.39.2

  • Features
    • Plugins may now control the order in which URLs are fetched during a crawl. The load on servers that prepare and cache presentations for multiple pages (e.g., an issue or a multi-page article) in a batch may be significantly reduced by fetching all pages in a single article or issue in a depth-first fashion, rather than the default breadth-first. Plugins may supply a comparator to order URLs by setting plugin_crawl_url_comparator_factory to the name of a CrawlUrlComparatorFactory.
    • The user interface allows admin users to set arbitrary daemon configuration parameters on the Expert Config page. (For example, to tailor user account settings to local policies.)
    • In a network where each peer's identity is confirmed using SSL and cryptographic certificates, the poller may be configured to serve repairs to trusted peers without prior agreement, by setting org.lockss.poll.v3.repairAnyTrustedPeer to true.
    • ServeContent and the audit proxy can be configured to generate a browsable index of close-match AUs along with a 404 response for a non-preserved URL.
  • Bug fixes
    • Failed plugin registry crawls were retried too often when no regular AUs needed crawling.
    • Linux hostconfig script erroneously changed owner of /etc and /etc/lockss.
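
The crawl-order feature above relies on a comparator that keeps the pages of one article together. In the daemon this would be a class named by plugin_crawl_url_comparator_factory; the Python sketch below only illustrates the ordering idea. The URL layout (volume/article/page) and the function names are assumptions made up for this example.

```python
from functools import cmp_to_key
from urllib.parse import urlparse


def article_of(url):
    """Hypothetical grouping rule: the first two path segments name the article."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    return tuple(segments[:2])


def compare_urls(a, b):
    """Order URLs so that all pages of one article sort together."""
    ka, kb = (article_of(a), a), (article_of(b), b)
    return (ka > kb) - (ka < kb)


# Breadth-first discovery tends to interleave pages of different articles.
queue = [
    "http://www.example.com/vol1/art1/page1",
    "http://www.example.com/vol1/art2/page1",
    "http://www.example.com/vol1/art1/page2",
    "http://www.example.com/vol1/art2/page2",
]
ordered = sorted(queue, key=cmp_to_key(compare_urls))
for url in ordered:
    print(url)
```

With this ordering both pages of art1 are fetched before any page of art2, so a server that prepares an article's pages as a batch is asked for each batch only once.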

Daemon 1.38.4

  • Features
    • The daemon's administrative web user interface now supports:
      • SSL (https).
      • Multiple user accounts.
      • Current users status.
      • User-settable passwords.
      • Strict password quality and rotation requirements.
      • Finer-grained permissions.
      • Customizable logo displayed on each page.
      • Customizable login page banner.
      • Instructions for enabling these features are in beta test and will be posted soon. Contact us if you need them.
    • The daemon now includes a framework for extracting bibliographic metadata from the content being preserved and displaying it. The details of how this is accomplished are publisher-dependent, thus metadata is available only for those publishers whose plugins have been enhanced to support it. In this release the only plugins to have been enhanced are those for HighWire Press and BePress. The AU status page for AUs with these plugins will have links to generate:
      • A list of all the DOIs in the AU.
      • A tab-separated table of the URL for each article in the AU and its DOI.
    • Plugin Inheritance. If a plugin's plugin_parent attribute is set, the plugin's definition is the merge of the parent's and child's definitions, with attributes set in the child taking precedence.
    • Keystore management has been centralized, so multiple daemon components (e.g., LCAP SSL and admin UI) may share keystores.
    • Crawl-end report (and HashCUS) now report hash digest in hex (was base64).
    • Size of login page checker buffer is settable.
    • AU status displays existence and status of crawl window.
    • SSL startup script (/etc/lockss/runssl) is passed daemon release name arg (e.g., --release 1.38.4).
    • Added a framework for PluginUtil to display various attributes of a hypothetical AU.
    • Config parameter changes:
  • Bug fixes
    • Hashed byte-count statistics were kept in an int.
    • ServeContent failed to rewrite links in several cases.
    • Record of which peers don't have which AUs was being reset too often, causing peers to be invited needlessly.
    • Added missing log level mappings to syslog logger.
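
The plugin inheritance rule above (child attributes take precedence over the parent's) can be pictured as a dictionary merge. This is a minimal Python sketch, assuming a shallow, attribute-by-attribute merge; the release note does not say how nested attributes are combined. The plugin names and attribute values are made up.

```python
def merge_plugin_definitions(parent, child):
    """Merge two plugin definitions; attributes set in the child win."""
    merged = dict(parent)   # start from the parent's attributes
    merged.update(child)    # child values override on conflict
    return merged


# Hypothetical definitions, for illustration only.
parent = {
    "plugin_name": "Example Parent Plugin",
    "au_def_pause_time": 6000,
    "au_def_new_content_crawl": 1209600000,
}
child = {
    "plugin_name": "Example Child Plugin",
    "plugin_parent": "org.example.ParentPlugin",
    "au_def_pause_time": 10000,
}
effective = merge_plugin_definitions(parent, child)
print(effective["au_def_pause_time"])         # 10000, set by the child
print(effective["au_def_new_content_crawl"])  # 1209600000, inherited
```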

Daemon 1.37.2

  • Features
    • Highest agreement with consensus is reported, as well as most recent.
    • Agreement history may be transferred to a replacement PeerId. (E.g., when a peer changes IP address.)
    • Select box in daemon status pages is now usable from lynx and other browsers without javascript.
  • Bug fixes
    • Eliminated unnecessary hashing of older content versions when a repair is received.
    • Proxy error messages include request hostname.
    • Files served by ServeContent are now cacheable.
    • HTTP servers (proxy, ServeContent, etc.) should now restart correctly when port is changed.
    • Crawler no longer double-fetches pages from sites requiring authentication.
    • Unknown host errors during crawl are reported correctly.
    • Crawl end report hashes unfiltered content.

Earlier Daemon Release Notes