richard's site| IT for archivists
Version 1.11.2 of siegfried is now available. Get it here. CHANGELOG v1.11.2 (2025-03-01) Filter introduced to improve Wikidata queries and -harvestWikidataSigLen flag sets minimum length of Wikidata signatures. Implemented by Ross Spencer and Andy Jackson -noprov flag introduced for Wikidata signatures. Implemented by Ross Spencer version command for roy. Implemented by Ross Spencer Wikidata definitions updated to 4.0.0. By Ross Spencer Logged error messages have more context. Implemented by...| IT for archivists
Version 1.11.1 of siegfried is now available. Get it here. CHANGELOG v1.11.1 (2024-06-28) WASM build. See pkg/wasm/README.md for more details. Feature sponsored by Archives New Zealand. Inspired by Andy Jackson -sym flag enables following symbolic links to files during scanning. Requested by Max Moser XDG_DATA_DIRS checked when determining siegfried home location. Requested by Michał Górny Windows 7 build on releases page (built with go 1.20). Requested by Aleksandr Sergeev update PRONOM to...| IT for archivists
Version 1.11.0 of siegfried is now available. Get it here. CHANGELOG v1.11.0 (2023-12-17) glob-matching for container signatures; see digital-preservation/pronom#10 sf -update requires less updating of siegfried; see #231 default location for siegfried HOME now follows XDG Base Directory Specification; see #216. Implemented by Bernhard Hampel-Waffenthal siegfried prints version before erroring with failed signature load; requested by Ross Spencer update PRONOM to v116 update LOC to 2023-12-14...| IT for archivists
Version 1.10.1 of siegfried is now available. Get it here. CHANGELOG v1.10.1 (2023-04-24): glob expansion now only on Windows & when no explicit path match (implemented by Bernhard Hampel-Waffenthal); compression algorithm for debian packages changed back to xz (implemented by Paul Millar); -multi droid setting returned empty results when priority lists contained self-references (see #218); CGO disabled for debian package and linux binaries (see #219).| IT for archivists
Version 1.10.0 of siegfried is now available. Get it here. The major changes in this release are the inclusion of a format classification field in results, a “droid” multi setting for roy, and improvements to the multi-sequence matching algorithm. New format classification field in results A new “class” field now appears in results (for the YAML, JSON and CSV outputs). It contains values from the format classification field in the PRONOM database which groups formats into categories s...| IT for archivists
Version 1.9.6 of siegfried is now available. Get it here. CHANGELOG v1.9.6 (2022-11-06) update PRONOM to v109| IT for archivists
Version 1.9.5 of siegfried is now available. Get it here. CHANGELOG v1.9.5 (2022-09-12) roy inspect now takes a -droid flag to allow easier inspection of old or custom DROID files github action to update siegfried docker deployment [https://github.com/keeps/siegfried-docker]. Implemented by Keep Solutions update PRONOM to v108 update tika-mimetype signatures to v1.4.1 update LOC signatures to 2022-09-01 incorrect encoding of YAML strings containing line endings; #202. parse signatures with of...| IT for archivists
Version 1.9.4 of siegfried is now available. Get it here. CHANGELOG v1.9.4 (2022-07-18) new pkg/static and static builds. This allows direct use of sf API and self-contained binaries without needing separate signature files. update PRONOM to v106 inconsistent output for roy inspect priorities. Reported by Dave Clipsham| IT for archivists
Version 1.9.3 of siegfried is now available. Get it here. CHANGELOG v1.9.3 (2022-05-23) JS/WASM build support contributed by Andy Jackson wikidata signature added to -update. Contributed by Ross Spencer -nopronom flag added to roy inspect subcommand. Contributed by Ross Spencer update PRONOM to v104 update LOC signatures to 2022-05-09 update Wikidata to 2022-05-20 update tika-mimetypes signatures to v2.4.0 update freedesktop.org signatures to v2.2 invalid JSON output for fmt/1472 due to tab i...| IT for archivists
The aim of this page is to describe how your private information is used by this site. If I’ve missed anything glaring, or you have any questions, or you’ve inadvertently shared information on this site that you’d like taken down, please contact me. General: the site is completely HTTPS (thanks, Let’s Encrypt!); the site is hosted by Google App Engine and, when I refer to server-side processing below, that’s what I mean; the site’s code is all open and published on GitHub.| IT for archivists
Version 1.9.2 of siegfried is now available. Get it here. CHANGELOG v1.9.2 (2022-02-07) Wikidata definition file specification has been updated and now includes endpoint (users will need to harvest Wikidata again) Custom Wikibase endpoint can now be specified for harvesting when paired with a custom SPARQL query and property mappings Wikidata identifier includes permalinks in results Wikidata revision history visible using roy inspect roy inspect returns format ID with name update PRONOM to v...| IT for archivists
Version 1.9.0 of siegfried is now available. Get it here. This release includes a new Wikidata identifier, implemented by Ross Spencer. CHANGELOG v1.9.0 (2020-09-22) a new Wikidata identifier, harvesting information from the Wikidata Query Service. Implemented by Ross Spencer. select which archive types (zip, tar, gzip, warc, or arc) are unpacked using the -zs flag (sf -zs tar,zip). Implemented by Ross Spencer. update LOC signatures to 2020-09-21 update tika-mimetypes signatures to v1.| IT for archivists
Version 1.8.0 of siegfried is now available. Get it here. This release includes changes in the byte matcher to improve performance, especially when scanning MP3s (fmt/134). CHANGELOG v1.8.0 (2020-01-22) utc flag returns file modified dates in UTC e.g. sf -utc FILE | DIR. Requested by Dragan Espenschied new cost and repetition flags to control segmentation when building signatures update PRONOM to v96 update LOC signatures to 2019-12-18 update tika-mimetypes signatures to v1.| IT for archivists
Version 1.7.13 of siegfried is now available. Get it here. This minor release fixes a bug in the namematcher that caused filenames containing “?” to be treated as URLs. It also adds the ability to scan directories using the sf -f command. Updates to the LOC FDD and tika-mimetypes signature files. Change Log v1.7.13 (2019-08-18) Added: the -f flag now scans directories, as well as files. Requested by Harry Moss Changed:| IT for archivists
Version 1.7.12 of siegfried is now available. Get it here. This minor release fixes a bug that caused .docx files with .doc extensions to panic and a bug with mime-info signatures. Updates to the PRONOM (v95), LOC FDD and tika-mimetypes signature files. Change Log v1.7.12 (2019-06-15) Changed: update PRONOM to v95 update LOC signatures to 2019-05-20 update tika-mimetypes signatures to v1.21 Fixed: .docx files with .doc extensions panic due to bug in division of hints in container matcher.| IT for archivists
Version 1.7.11 of siegfried is now available. Get it here. This minor release fixes the debian package and allows the container matcher to identify directory names (for SIARD matching). Updates to the LOC FDD and tika-mimetypes signature files. Change Log v1.7.11 (2019-02-16) Changed: update LOC signatures to 2019-01-06 update tika-mimetypes signatures to v1.20 Fixed: container matching can now match against directory names. Thanks Ross Spencer for reporting and for the sample SIARD signature...| IT for archivists
Version 1.7.10 of siegfried is now available. Get it here. This minor release fixes a regression in the LOC identifier introduced in 1.7.9 and updates to PRONOM v94. Changelog v1.7.10 (2018-09-19) Added: print configuration defaults with sf -version Changed: update PRONOM to v94 Fixed: LOC identifier fixed after regression in v1.7.9 remove skeleton-suite files triggering malware warnings by adding to .gitignore; reported by Dave Rice release built with Go version 11, which includes a fix for ...| IT for archivists
Version 1.7.9 of siegfried is now available. Get it here. According to the develop benchmarks, this release is slightly more accurate than v1.7.8, with only a marginal impact on performance. The highlights of this release are a new system for saving configurations for the sf tool, changes to the matching algorithm to improve accuracy, and simplifications to the basis field. Save and load frequently used configurations The new -setconf flag allows you to save frequently used configurations for...| IT for archivists
The next siegfried release will be out shortly. I have been busy making changes to address two thorny issues: verbose basis and missing results. I have some fixes in place but, when doing large scale testing against big sets of files, I noticed some performance and quality regressions. You can see these regressions on the new develop benchmarks page. There’s also a new benchmarks page to measure siegfried against comparable file format identification tools (at this stage just DROID).| IT for archivists
Version 1.7.8 of siegfried is now available. Get it here. This minor release updates the PRONOM signatures to v93 and the LOC signatures to 2017-09-28. As the only changes in this release are to signature files, you can just use sf -update if you’ve installed siegfried manually. This minor release is just for the convenience of users who have installed sf with package managers (i.e. debian or homebrew).| IT for archivists
Version 1.7.7 of siegfried is now available. Happy #IDPD17! Get it here. This minor release fixes bugs in the roy inspect command and in sf’s handling of large container files. A new sets file is included in this release, ‘pronom-extensions.json’, which creates sets for all extensions defined in PRONOM. You can use these new sets when building signatures e.g. roy build -limit @.tiff or when logging formats e.g. sf -log @.| IT for archivists
Version 1.7.6 of siegfried is now available. Get it here. This is a minor release that incorporates the latest PRONOM update (v92), introduces a “continue on error” flag (sf -coe) to force sf to keep going when it hits fatal file errors in directory walks, and restricts file scanning to regular files (in previous versions symlinks, devices, sockets etc. were scanned which caused fatal errors for some users). Thanks to Henk Vanstappen for the bug report that prompted this release.| IT for archivists
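To illustrate the two behaviours described in the 1.7.6 note (scanning only regular files, and carrying on when a file error occurs during a directory walk), here is a minimal Python sketch. It is not siegfried's code (siegfried is written in Go), and the function name `regular_files` and its `continue_on_error` parameter are hypothetical, chosen only for this example.

```python
import os
import stat


def regular_files(root, continue_on_error=True):
    """Yield paths of regular files under root.

    Symlinks, devices, sockets and other special files are skipped.
    Hypothetical helper for illustration; not part of siegfried.
    """
    def on_error(err):
        # os.walk reports directory-listing errors here; re-raise them
        # unless we were asked to keep going.
        if not continue_on_error:
            raise err

    for dirpath, _dirnames, filenames in os.walk(root, onerror=on_error):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat does not follow symlinks, so a symlink is
                # reported as a symlink rather than as its target.
                mode = os.lstat(path).st_mode
            except OSError:
                if continue_on_error:
                    continue
                raise
            if stat.S_ISREG(mode):
                yield path
```

Calling `list(regular_files("some/dir"))` would return only the ordinary files beneath that directory. With `continue_on_error=False`, the first unreadable directory or file raises instead of being skipped.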
In my recent updates to this site I’ve added a new “Chart your results” tool on the siegfried page (in the right hand panel under “Try Siegfried”). This tool produces single page reports like this: /siegfried/results/ea1zaj. Before covering this tool in detail let’s recap some of the existing ways you can already analyse your results. Other ways of charting and analysing your results Command-line charting I appreciate that not everyone is a command-line junkie, but the way I inspe...| IT for archivists
Version 1.7.5 of siegfried is now available. Get it here. The headline feature of this release is new functionality for the sf -update command requested by Ross Spencer. You can now use the -update flag to download or update non-PRONOM signatures with a choice of LOC FDD, two flavours of MIMEInfo (Apache Tika’s MIMEInfo and freedesktop.org), and archivematica (latest PRONOM + archivematica extensions) signatures. There are two combo options as well: PRONOM/Tika/LOC and the Ross Spencer “d...| IT for archivists