Help:Zoomable images

Commons prefers images in the highest resolution possible. In addition to low-resolution previews, some websites offer zoomable high-resolution images. Zooming is usually handled by a small piece of client-side software (a JavaScript or Flash program that your browser downloads), which loads the image and provides the zooming capability. Because of this, the high-resolution image is not as easy to retrieve in one piece as the low-resolution preview. This page gives a few hints on how to get the high-resolution image all the same.

But beware: Only use the techniques presented here if you can demonstrate clearly that the image in question is in the public domain (i.e., that it is not copyrighted), or that it is freely licensed, and if you are sure not to break any local laws by doing so.

Zoomify

Zoomify is a program that offers zooming into high-resolution images. To reduce the network traffic and to improve response time, it doesn't download the full high-resolution image. Instead, the image is broken up into tiles: small rectangular areas, each small enough to be quickly loaded. From the zoom level and the visible area, the program calculates which tiles it needs to load to display the visible part of the image in a higher resolution, and then loads only those tiles. It follows that the full high-resolution image is accessible, but unfortunately only as a (possibly large) number of separate image files, one for each tile.
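As a rough illustration of that calculation, the sketch below works out which tiles cover a given visible area at one zoom level. The 256-pixel tile size and all function and parameter names are assumptions made for this example; they are not taken from Zoomify itself.

```python
def visible_tiles(left, top, view_width, view_height,
                  level_width, level_height, tile_size=256):
    """Return the (column, row) pairs of all tiles that intersect the viewport.

    All coordinates are in pixels of the image at the current zoom level.
    """
    # Clamp the viewport to the image so no tiles outside it are requested.
    first_col = max(left, 0) // tile_size
    first_row = max(top, 0) // tile_size
    last_col = (min(left + view_width, level_width) - 1) // tile_size
    last_row = (min(top + view_height, level_height) - 1) // tile_size
    return [(col, row)
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]


# A 500x300 viewport at offset (300, 100) in a 2000x1500 zoom level.
print(visible_tiles(300, 100, 500, 300, 2000, 1500))
```

Only the six tiles returned here would have to be loaded for that view, which is why the full image never arrives as a single file.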

Dezoomify

Dezoomify (if defunct, try this mirror) is a web page written in JavaScript that draws the image onto an HTML5 canvas element. In Firefox, you can therefore right-click the canvas and choose Save As. It requests all tiles from the server asynchronously, which is much faster than scripts that request the tiles one by one, and you don't need to install anything. It also supports pictures from the National Gallery, which uses a system very similar to Zoomify. The source is available on GitHub.

Just enter the URL of a page containing a zoomify object (example) into the form.

Supported sites and image zooming tools

Cons

  • You can't choose the zoom level. The script downloads the images at the highest zoom level. However, you can choose the final image size.
  • The result is always a PNG file, regardless of the source format. For uploading to Commons, it is usually best to convert it to JPEG.
  • Requires the browser tab to be active (tested in Firefox).

dezoomify-rs

dezoomify-rs comes as a standalone application, and allows downloading images that are larger than what dezoomify supports. It lets you choose the zoom level, and the format in which you want to save the output image.

Supported formats

  • zoomify
  • IIIF
  • Google Arts & Culture
  • krpano
  • IIPImage
  • custom (you can create a small file that describes the tiles layout, and the tool will fetch and assemble them)

dezoomify also comes with a browser extension that monitors network requests and automatically identifies key URL components for use with dezoomify.

Dezoomify.py

The Dezoomify Python script takes the URL of a page containing a Zoomify image, scrapes the necessary information, asynchronously downloads the image tiles at the maximum zoom level and losslessly (that is, without any re-compressing of the JPEG image and the resulting quality loss) stitches them together into a single image. Python 3 must be installed. Other included features:

  • batch mode for downloading several images,
  • optionally download a non-maximum zoom level of the image,
  • manually specify the Zoomify base directory, if necessary.

A GitHub fork can be found here. An old version of the script, which uses Python 2 and the Python Imaging Library instead of jpegtran, can be found here.

zoomify_dl

zoomify_dl is a Perl script that downloads and assembles images that have been cut up for online viewing with the "Zoomify" Flash plugin. The image can then be viewed (or manipulated) locally rather than tediously online (or not at all, if the Zoomify viewer doesn't work in your browser).

zoomify_dl is relatively full featured in that it supports:

  • Downloading any of the saved zoom levels of the image.
  • Downloading only a section/crop of the image.
  • Finding Zoomified image paths in html pages including recursive page searches.
  • Displaying information about a zoomified image without downloading.

Zoomify.php

Loading all tiles at once

A tool exists to bypass zoomify and to load all tiles at once. This generates a single web page that shows the whole high-resolution image, although it is still composed of individual tiles. But since they are all properly arranged side by side, you won't see that. This tool resides on our Toolforge at the URL https://zoomable-images.toolforge.org/zoomify.php

It takes two parameters, appended to the end of the address as a query string (see the example below):

  • zoom=[1-8] This defines the zoom level for which you want to get the tiles. The higher the number, the larger the final image will be. Try the tool with level 5, if that fails, lower the parameter.
  • path=URL This gives the path to the directory on the server where the tiles are stored. To find this path, search the HTML source of the page for "zoomifyImagePath", or look at your browser's disk cache (URL: "about:cache" in Firefox). See also Monitoring HTTP requests below.

You can examine the sourcecode of the simple script: zoomify.php, zoomify-form.php.

  • The first step of the script is to look at: $path/ImageProperties.xml to get width and height of the image.
  • Then the script performs some calculations to generate a table with the images for the chosen zoom level, and sends back a web page containing all these images. The script does not itself fetch these images from the server; it just sends your browser the generated web page, which links to all the tiles on the Zoomify server. Your browser then loads all the tiles from that server.
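The two steps can be sketched in Python as follows. This is not the code of zoomify.php; it is a minimal illustration that assumes the commonly described Zoomify layout: ImageProperties.xml carries WIDTH, HEIGHT and TILESIZE attributes, each lower tier halves the resolution until the image fits into one tile, and tiles are stored in TileGroup folders of 256 tiles each, counted from the lowest tier upwards. It also assumes that the zoom value equals the tier number (0 being the single-tile overview), which may not match the tool's own 1-8 numbering. All function names are hypothetical.

```python
import html
import math
import xml.etree.ElementTree as ET

TILES_PER_GROUP = 256  # assumed number of tiles per TileGroup folder


def parse_image_properties(xml_text):
    """Read width, height and tile size from the text of ImageProperties.xml."""
    root = ET.fromstring(xml_text)
    return (int(root.attrib["WIDTH"]), int(root.attrib["HEIGHT"]),
            int(root.attrib.get("TILESIZE", 256)))


def tier_grid(width, height, tile_size=256):
    """Return (columns, rows) per tier, from the single-tile overview upwards."""
    tiers = []
    span = tile_size  # image pixels covered by one tile at the current tier
    while width > span or height > span:
        tiers.append((math.ceil(width / span), math.ceil(height / span)))
        span *= 2
    tiers.append((1, 1))
    return tiers[::-1]


def tile_table_html(path, zoom, width, height, tile_size=256):
    """Build an HTML table whose cells link to every tile of one tier."""
    tiers = tier_grid(width, height, tile_size)
    columns, rows = tiers[zoom]
    # Tiles of all lower tiers come first when counting TileGroup membership.
    tiles_before = sum(c * r for c, r in tiers[:zoom])
    base = html.escape(path.rstrip("/"), quote=True)
    lines = ['<table cellspacing="0" cellpadding="0">']
    for y in range(rows):
        cells = []
        for x in range(columns):
            group = (tiles_before + y * columns + x) // TILES_PER_GROUP
            cells.append(
                f'<td><img src="{base}/TileGroup{group}/{zoom}-{x}-{y}.jpg"></td>')
        lines.append("<tr>" + "".join(cells) + "</tr>")
    lines.append("</table>")
    return "\n".join(lines)
```

As in the description above, nothing here downloads a tile; the generated page only points the browser at the tile URLs.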

Example

A site that uses the zoomify program is dsloan.com. Click "zoom" on the first image on the right to see it. Note that the title bar of the image display says "https://www.dsloan.com/Auctions/A22/zoomer.php?file=zoomify/kendall-nebel-01". Evidently, the Flash program loads the tiles from https://www.dsloan.com/Auctions/A22/zoomify/kendall-nebel-01 To view that image as a whole in the highest available resolution, you would thus enter the following URL in your browser:

https://zoomable-images.toolforge.org/zoomify.php?zoom=5&path=http://www.dsloan.com/Auctions/A22/zoomify/kendall-nebel-01
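The same address can be composed programmatically. The following sketch uses Python's standard urllib.parse module; the function name is hypothetical, and leaving ":" and "/" unescaped in the path is a choice made only to reproduce the example URL as written above.

```python
from urllib.parse import urlencode

TOOL_URL = "https://zoomable-images.toolforge.org/zoomify.php"


def zoomify_tool_url(tile_path, zoom=5):
    """Append the zoom and path parameters to the tool's address."""
    # safe=":/" keeps the tile path readable instead of percent-encoding it.
    query = urlencode({"zoom": zoom, "path": tile_path}, safe=":/")
    return f"{TOOL_URL}?{query}"


print(zoomify_tool_url(
    "http://www.dsloan.com/Auctions/A22/zoomify/kendall-nebel-01"))
```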

The result is a page containing all the tiles in the proper layout. That page takes a while to load, so be patient. Note that the title of the page tells you what the maximum zoom level is, so if it indicates that even higher resolutions than zoom level 5 are available, try increasing the zoom parameter.

Now you can view the image as a whole, but it's still only individual tiles laid out. How can you save this display as a single file? There are several ways:

  1. If you're using Firefox, you might consider using addons like Full Web Page Screenshots (formerly "FireShots") or Page Saver WE. With that, you can take a snapshot of the whole web page (even including those parts that are not visible in the browser window) at the screen resolution.
  2. Alternatively, you could save the whole web page locally, which would also save the tiles, and then assemble all the tiles in a graphics program. This is a tedious process to do manually when there are many tiles, but it avoids the detour through a screenshot, which may lose quality. If you're using the GIMP (version 2.4 or higher), there is a Scheme script to automatically assemble all the tiles for you: de-tile.scm. Copy this script into the "script" folder of the GIMP and refresh the scripts (or restart the GIMP). Then open the top-left tile in GIMP (the one named "5-0-0.jpg"). In the "Filters→Combine" sub-menu, you should have an entry "De-Tile". Select that and wait. The script will load all the other tiles and assemble them into one single image. When it's done, save the resulting image.
  3. Other programs capable of creating a composite from tile images are XnView (Windows) and XnViewMP (Linux, macOS, Windows); see xnview.com for both.
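If you prefer a script over a graphics program, assembling saved tiles could look roughly like the sketch below. It assumes that the tiles sit in one folder, all belong to a single zoom level, and are named zoom-column-row.jpg as in "5-0-0.jpg"; the 256-pixel tile size and the function names are assumptions as well. The stitch function relies on the third-party Pillow library and decodes the JPEG tiles, so saving the result as PNG avoids a second round of JPEG compression.

```python
import os
import re

TILE_NAME = re.compile(r"^(\d+)-(\d+)-(\d+)\.jpg$")


def paste_positions(filenames, tile_size=256):
    """Map each tile file name to the (left, top) pixel offset of that tile."""
    positions = {}
    for name in filenames:
        match = TILE_NAME.match(name)
        if match:  # silently skip anything that is not a tile
            _zoom, column, row = map(int, match.groups())
            positions[name] = (column * tile_size, row * tile_size)
    return positions


def stitch(directory, output_path, tile_size=256):
    """Paste all tiles found in a directory onto one canvas and save it."""
    from PIL import Image  # third-party: Pillow

    positions = paste_positions(os.listdir(directory), tile_size)
    tiles = {name: Image.open(os.path.join(directory, name))
             for name in positions}
    # Tiles on the right and bottom edges may be smaller than tile_size.
    width = max(left + tiles[name].width
                for name, (left, _top) in positions.items())
    height = max(top + tiles[name].height
                 for name, (_left, top) in positions.items())
    canvas = Image.new("RGB", (width, height))
    for name, offset in positions.items():
        canvas.paste(tiles[name], offset)
    canvas.save(output_path)
```

A call such as stitch("saved_page_files", "whole.png") would then write the combined image; both names are placeholders.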

(Note: do not upload this example image. It already exists at the Commons as File:Nebel Mexican War 01 Battle of Palo Alto.jpg.)

dezoomify.rb script

Ruby + ImageMagick script that grabs and stitches Zoomify images: https://gist.github.com/59636

National Gallery collection

You can browse the collection of the National Gallery online via a panning/scrolling/zooming widget. However, looking at paintings through a tiny porthole (even with the "full screen" view) is limiting.

These tools let you view paintings that are part of British and European cultural heritage on your own terms. Indeed, the aims of the Gallery itself support this view:

The Gallery aims to study and care for the collection, while encouraging the widest possible access to the pictures.

Browser tool
  • Dezoomify only requires a web browser. Firefox is preferred, because it can save the displayed pictures.
Other tools
  • zoom.sh: only Bash and ImageMagick required, very simple and almost self-contained.
  • natgal-dl: needs Ruby, Ruby progressbar library and ImageMagick. Does not work on Windows. It was developed by Paul Battley.[1]

Memorix Maior

Used in several Dutch cultural heritage databases.

Dememorixer is developed by User:1Veertje and should work for most of these databases.

Files that can potentially benefit from this tool have been sorted into Category:Dememorixer.

List of archives that use Memorix Maior

Archive | About | Notes
Rijksbureau voor Kunsthistorische Documentatie | Dutch art history |
Beeldbank Amsterdam | Amsterdam | Download button on site gives full resolution
Erfgoed Leiden | Leiden |
Stadsarchief Breda | Breda | Acquire permalink through "tweet this" button
Archief Zaanstad | Zaanstad |
Beeldbank defensie | Dutch armed forces |
Fries Scheepvaart Museum | Frisian shipping |
Rijnstreek en Lopikerwaard | Rijnstreek and Lopikerwaard |
Streekarchief Langstraat Heusden Altena | Langstraat, Heusden and Altena |
Historisch Centrum Leeuwarden | Leeuwarden |
Het Bewaren Waard.nl | Netherlands |
Heraldische Databank | Dutch Heraldry |
Musea in Drenthe | Drenthe |
Drents Archief | Drenthe |
Gemeentearchief Veenendaal | Veenendaal |
Waterlands Archief | Waterland |
Regionaal Archief Alkmaar | Alkmaar |
Beeldbank Zeeland Seaports | Seaports of Zeeland |
Beeldbank Rijswijk | Rijswijk |
Beeldbank Schiedam | Schiedam |
Gemeentearchief Tholen | Tholen | Download button on site seems to already give full resolution
Erfgoed Rijssen-Holten | Rijssen-Holten | Turn off Flash to see thumbnail URL; please use {{RijssenHolten}} for permalink
Royal Netherlands Institute of Southeast Asian and Caribbean Studies | Southeast Asia and Caribbean | *
Regionaal Archief Zutphen | Zutphen |
Regionaal Archief Noord-Holland | Noord-Holland |
Ga Het Na.nl | Netherlands |
Alle Groningers | Birth, wedding and death certificates of people from Groningen |
Alle Friezen | Birth, wedding and death certificates of people from Friesland |
Haagse Beeldbank | The Hague |
Gelderland in beeld | Gelderland | **
Beeldbank Groningen | Groningen |
Koninklijke Verzamelingen | Dutch Royal Family |

*: Permalink URL can only be found here by looking it up in the source code. Look for:

<meta property="og:url" content="http://example.com">

A thumbnail appears when Flash is turned off, and the Dememorixer can generate a full image based on the URL of the thumbnail. Looking up the permalink and including it in the metadata on Wikimedia Commons is appreciated, though. This Greasemonkey script places the permalink in the comment box under the image.

**: The permalink can be tracked down by clicking on the "Tweet this" button. This Greasemonkey script changes the "share on Facebook" button into a link to the permalink.

On-demand generation

Some zoom utilities do not use tiling. Instead, they ask the server to generate a crop of the original high-resolution image at the resolution, size, and coordinates the user is currently viewing. In such cases, it may be a bit more difficult to get at the full high-resolution image, because the image file itself may be inaccessible from the Internet. Only the server-side program that generates crops from the file can be accessed.

Such server-side programs need to take a few parameters to know what to return. Typically, these parameters are:

  • the desired resolution
  • the width and the height of the image that should be generated
  • the offset (x- and y-coordinates) within the full-resolution image from which the crop should be generated.

If the URL for the server-side program can be determined, and the parameters can be identified, it is then usually possible to manually query the server (using hand-written URLs) to see what its limits are. Some servers allow setting x and y to zero and the width and the height to arbitrarily large values, so that might be a way to get the full high-resolution image. Others place limits on the maximum width and height; in this case, one needs to get individual crops and combine them like the tiles above.
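When a server caps the width and height of a single response, the crops needed to cover the whole image can be planned in advance. The sketch below is a generic illustration; the function name and the tuple layout (x, y, width, height) are assumptions, and how these values map onto a particular server's URL parameters has to be worked out per site.

```python
def crop_grid(full_width, full_height, max_width, max_height):
    """List the (x, y, width, height) crops that together cover the image."""
    crops = []
    for y in range(0, full_height, max_height):
        for x in range(0, full_width, max_width):
            # The last crop in each row and column is clipped to the image edge.
            crops.append((x, y,
                          min(max_width, full_width - x),
                          min(max_height, full_height - y)))
    return crops


# A 7000x5000 image on a server that returns at most 3000x3000 pixels.
for crop in crop_grid(7000, 5000, 3000, 3000):
    print(crop)
```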

Examples

Example 1: CONTENTdm sites

CONTENTdm is digital collection management software that uses on-demand generation of zoomable images. A site using this software is the C. R. Savage Collection at Brigham Young University. Click on the image to zoom in. But where does this image come from?

  • Click "View source" in your browser. Examine the HTML source. You'll find a "<form name="mainimage" action="">", containing an "<input type="image"" with the source URL
/cgi-bin/getimage.exe?CISOROOT=/Savage2&CISOPTR=1513&DMSCALE=8.04074&DMWIDTH=600&DMHEIGHT=600&DMX=0&DMY=0&DMTEXT=&REC=10&DMTHUMB=1&DMROTATE=0
  • Note the parameters for the scale, width, height, x, and y. There is also a "thumb" parameter set to 1.
  • The server is https://contentdm.lib.byu.edu/, as seen in the URL of the page.
  • Let's try this: enter the URL
https://contentdm.lib.byu.edu/cgi-bin/getimage.exe?CISOROOT=/Savage2&CISOPTR=1513&DMSCALE=100.0&DMWIDTH=3000&DMHEIGHT=3000&DMX=0&DMY=0&DMTEXT=&REC=10&DMTHUMB=0&DMROTATE=0
  • This defines a scale of 100%, width and height of 3000px, x and y as zero (i.e., from the top-left corner) and sets the "thumb" parameter to zero. Let's see what we get.
  • We're getting closer. It gives us a 766kB file of 3000×3000 pixels, but it isn't large enough yet to show the full original. Let's step up the dimensions to 8000×6000px. That file will likely be large (if we get anything at all), so be prepared to wait a bit if you try this.
  • Voilà, there we are: we've got the full picture as a 7462×6000px image (3.7MB). Save it, then crop it, and you're done. (Crop away at least the blue border. We neither need nor want that.)
  • We could have determined the (approximate) size of the full high-resolution version also from the start: the original zoom level was shown as 8%, and the thumbnail was 598×507px large. At 100%, the image would thus need to be about 7475×6338px...
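The estimate in the last step is simple arithmetic: divide the preview dimensions by the scale. A small sketch, with a hypothetical function name and the result rounded up to whole pixels:

```python
import math


def estimate_full_size(preview_width, preview_height, scale_percent):
    """Estimate the full-resolution size from a preview shown at a given scale."""
    factor = 100 / scale_percent
    return (math.ceil(preview_width * factor),
            math.ceil(preview_height * factor))


# Preview of 598x507 pixels displayed at a zoom level of 8%.
print(estimate_full_size(598, 507, 8))  # (7475, 6338)
```

Using the more precise DMSCALE value from the URL instead of the rounded 8% gives a slightly smaller estimate.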

(Again, please don't upload this example image. It already exists at the Commons as File:First Presidency and Twelve Apostles 1898.jpg.)

Example 2: David Rumsey Map Collection

Another well-known site using on-demand generation is the David Rumsey Map Collection. For this site, there are two ways to get full high-resolution images. We'll illustrate both techniques using the example image here (an old map of a part of Chile). Please don't upload this image; it already exists at the Commons as File:Chile.Pissis-A-rioloa.djvu.

The first technique gets you a high-resolution JPEG image, using the techniques shown in Example 1 above. If you examine the URLs for the images loaded by their viewer (see Monitoring HTTP requests below), you'll discover that it uses a URL like this:

rumseysid.lunaimaging.com/mrsid/bin/image_jpeg.pl?client=Rumsey&image=SIDS/D0052/0734001.sid&x=3212&y=2332&width=803&height=583&level=3

Again, note the parameters for x, y, width, height, and the zoom level. When you zoom in, you'll notice that the level parameter decreases, and is zero at the highest level. Some experimentation will quickly yield the following URL, which gives a full high-resolution JPEG image of the whole map:

http://rumseysid.lunaimaging.com/mrsid/bin/image_jpeg.pl?client=Rumsey&image=SIDS/D0052/0734001.sid&x=3000&y=2000&width=6000&height=5000&level=0

Right-click and choose "Save image as..." to save the file on your computer, then crop away the borders.
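The experimentation with the URL can be made less error-prone by rewriting the query parameters of an observed request instead of editing the address by hand. The helper below is a sketch using Python's standard urllib.parse module; its name is hypothetical, and the "http://" scheme on the observed URL is taken from the full URL quoted above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_params(url, **changes):
    """Return the URL with the given query parameters replaced or added."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    params.update({key: str(value) for key, value in changes.items()})
    # safe="/" leaves the slashes in the image path unescaped.
    return urlunsplit(parts._replace(query=urlencode(params, safe="/")))


observed = ("http://rumseysid.lunaimaging.com/mrsid/bin/image_jpeg.pl"
            "?client=Rumsey&image=SIDS/D0052/0734001.sid"
            "&x=3212&y=2332&width=803&height=583&level=3")
print(with_params(observed, x=3000, y=2000, width=6000, height=5000, level=0))
```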

The second technique gives you the original MrSID file of the map. The Rumsey Collection includes a direct link to this file in the left sidebar. Unfortunately, this link doesn't work because it points to the non-existent URL "http://www.davi". It used to work, but either there's an error in their server-side software, or they've disabled these links intentionally. If that link works for you, fine. If not, the full link is still in the HTML source of the page. Open the HTML source in your browser (it should have a "View source" menu item somewhere) and scroll down. You'll discover that the actual URL given is "https://www.davi drumsey.com/rumsey/d ownload.pl?image=/D0 052/0734001.sid", i.e., it contains blanks. Copy this malformed URL into your browser's address bar, remove the extra blanks so that it reads https://www.davidrumsey.com/rumsey/download.pl?image=/D0052/0734001.sid, and hit return. Your browser should now ask you where to save the file.

Once you've got the MrSID file, you can convert it using e.g. IrfanView (slow) or the tools provided by LizardTech, the company that developed the MrSID format. You cannot upload MrSID files to the Commons because MrSID is a proprietary format and we allow only free file formats. You will have to convert the image to either JPEG or DjVu. Note that MrSID files use more advanced compression techniques (wavelet compression) than JPEG files, so converting a MrSID file into a JPEG may yield a huge JPEG file. The Commons has a maximum file size for uploads of 100MB, but JPEGs that large are unwieldy and may be hard to handle. Try to keep the file size much lower, at most around 10-20MB.
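Removing the blanks from the malformed download link quoted above is a one-liner if you have several such links to repair. The function name is hypothetical:

```python
def remove_blanks(url):
    """Drop every whitespace character from a URL copied out of HTML source."""
    return "".join(url.split())


broken = "https://www.davi drumsey.com/rumsey/d ownload.pl?image=/D0 052/0734001.sid"
print(remove_blanks(broken))
```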

Example 3: Sotheby's

This portrait on auction at Sotheby's in 2008 is loaded in full if the user zooms in. Right-click-save is disabled, but looking at the source of the page quickly reveals the full URL of the image.

<img id="image-zoom" class="lot-image" data-src="/content/dam/stb/lots/AM1/AM1051/AM1051-40-lr-1.jpg" src="">
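Finding such an attribute by eye is easy on one page; for several pages a short script can do it. The sketch below uses Python's standard html.parser and urljoin to turn the site-relative data-src value into a full URL. The class and function names are hypothetical, and the page address in the usage line is a placeholder, since the real page URL is not given here.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LazyImageFinder(HTMLParser):
    """Collect the data-src attribute of every img element."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            source = dict(attrs).get("data-src")
            if source:
                # Resolve site-relative paths against the page address.
                self.image_urls.append(urljoin(self.page_url, source))


def find_lazy_images(html_text, page_url):
    finder = LazyImageFinder(page_url)
    finder.feed(html_text)
    return finder.image_urls


snippet = ('<img id="image-zoom" class="lot-image" '
           'data-src="/content/dam/stb/lots/AM1/AM1051/AM1051-40-lr-1.jpg" src="">')
# "https://example.org/lot.html" stands in for the address of the lot page.
print(find_lazy_images(snippet, "https://example.org/lot.html"))
```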

There are also browser extensions that re-enable the right-click context menu.

Monitoring HTTP requests

How can you find out through what URL an image is loaded? Unlike static linking, these dynamic zoom tools have the side effect that the URLs of the images they load are hidden from the user. Sometimes the URLs are visible in the HTML source of the web page. If not, another way to determine them is to monitor the network traffic. All these client-side tools need to make HTTP requests to their server to get the images or the tiles.

  • Using a local proxy, it is possible to obtain a log of all requests made. Privoxy is a reliable freeware proxy that (amongst a lot of other useful features) also has a request log showing each request URL. Set up your browser to go through that proxy, and examine the request log to find the image URL.
  • Alternatively, there are tools that monitor a browser's traffic directly, such as the Network monitoring tool extension, which is available for both Chrome and Firefox.
  • In Chrome or Firefox, hit F12 to open the developer tools. Go to "Resources" and browse through the folder "Frames" to see a list of all the images, with their URLs and thumbnails.
  • In Firefox, you can also examine the URLs of loaded images through the "Tools→Page Info" menu, "Media" tab. Firefox lists all the images, with their URL and thumbnails.
  • Safari displays the URLs accessed from all open pages in its "Activity" window.
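Once requests have been recorded, picking out the image URLs can be automated. The sketch below assumes that the recording was exported from the browser's developer tools as a HAR file (a JSON log of requests), which is one possible source not covered in the list above; the function name is hypothetical.

```python
import json


def image_urls_from_har(har_text):
    """Return the URLs of all logged requests whose response was an image."""
    har = json.loads(har_text)
    urls = []
    for entry in har["log"]["entries"]:
        mime = entry.get("response", {}).get("content", {}).get("mimeType", "")
        if mime.startswith("image/"):
            urls.append(entry["request"]["url"])
    return urls
```

Reading the exported file and passing its text to image_urls_from_har then yields the tile or crop URLs in the order they were requested.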

Notes for uploading

If one of the dimensions of the image exceeds 65,535 pixels (the largest unsigned 16-bit integer), it will not be possible to create a JPEG. Creating a tile set is one workaround.

There is a file size limit of 100MB on re-uploads. If the image exceeds this size, then upload a new file.

References list