ImageMagick v6 Examples --
Distorting Images

Index
ImageMagick Examples Preface and Index
Simple Distorts
Rotate and Shearing
Circular Distorts
General Distortion Techniques
Generalized Distortion Operator
Distortion LUT Maps
Distortion Animations (fun examples)
Affine Matrixes, and DIY Distortions (separate page)

Distortions are operators that warp or re-arrange the contents of the image. In ways similar to fun-house mirrors you too can distort images in extreme ways. The image itself remains intact, but the location of the various parts of the image could be wildly displaced on the canvas.


Simple Distorts

Flipping, and Mirroring

[IM Output] These simple distortions do not change the actual image data. They don't even change the rectangular shape of the image; they just rearrange the pixels.

For these examples let's use this cute looking koala image...

The simplest image distortion is to rearrange the pixels in the image so as to "-flip" it upside-down.

  convert koala.gif  -flip  flip.gif
[IM Output]

Or by using "-flop" you can generate a mirror image.

  convert koala.gif  -flop  flop.gif
[IM Output]

Transpose and Transverse, Diagonally

The "-transpose" and "-transverse" image operations produce diagonal mirrors of the image.

The "-transpose" mirrors the image along the image top-left to bottom-right diagonal.

  convert koala.gif  -transpose  transpose.gif
[IM Output]

While "-transverse" mirrors the image along the image bottom-left to top-right diagonal.

  convert koala.gif  -transverse  transverse.gif
[IM Output]

Rectangular Rotates

All the above operations essentially produce a mirror image of the original. The "-rotate" operator provides the non-mirrored versions of the image.


  convert koala.gif  -rotate   0  rotate_0.gif
  convert koala.gif  -rotate  90  rotate_90.gif
  convert koala.gif  -rotate 180  rotate_180.gif
  convert koala.gif  -rotate -90  rotate-90.gif
  convert koala.gif  -rotate 360  rotate_360.gif
[IM Output] [IM Output] [IM Output] [IM Output] [IM Output]

Note that "-rotate" is a simple distort only if you use a rotation angle of a multiple of 90 degrees. Any other angle will introduce other more complex pixel level distortions into the image.

You may notice that a positive angle of rotation is clockwise, which seems to be mathematically incorrect. Internally, however, it is mathematically correct; the apparent reversal is caused by the use of a negated Y-axis. That is, the Y-axis goes from 0 at the top and is positive downward. Because of this the coordinate system is reversed, and thus the direction of rotation is also reversed.
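
A small sketch may make this concrete. It applies the textbook (counter-clockwise) rotation matrix to a point and then reads the result in image coordinates, where Y grows downward. The helper name `rotate_point` is an illustrative assumption and is not part of ImageMagick.

```python
import math

def rotate_point(x, y, degrees):
    """Apply the standard mathematical rotation matrix to (x, y)."""
    t = math.radians(degrees)
    return (x * math.cos(t) - y * math.sin(t),
            x * math.sin(t) + y * math.cos(t))

# A point one unit to the right of the origin, rotated by +90 degrees.
x, y = rotate_point(1.0, 0.0, 90)

# Mathematically the point now sits at (0, +1).  With Y increasing
# downward, as in an image, +1 is *below* the origin, so the point has
# moved from "3 o'clock" to "6 o'clock": a clockwise turn on screen.
print(round(x, 6), round(y, 6))
```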

Digital photos can also be rotated to match the recorded Camera Orientation by using the "-auto-orient" operator. This was added in IM v6.2.7-8.

Rolling Images like a bad TV

You can "-roll" an image vertically (like a TV that is out of sync). The amount of the roll is given in pixels.

  convert koala.gif  -roll +0+20  roll_horiz.gif
[IM Output]

Of course you can also roll the image sideways...

  convert koala.gif  -roll +30+0  roll_vert.gif
[IM Output]

Or by using a negative number of pixels, you can roll it in the opposite direction.

  convert koala.gif  -roll +0-20  roll-horiz.gif
[IM Output]

Simple Distorts Summary

The most important aspect of all these operators is that you can add them all together in many different ways such that the result will be exactly as if no operation was performed at all.

  convert koala.gif -roll +25+0 -rotate 90  -flop \
          -roll +0-25  -flip  -rotate 90    original.gif
[IM Output]
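
To see why that particular sequence cancels out, here is a minimal pure-Python sketch that models the four simple distorts on a grid of labelled cells. The function names (`roll`, `flip`, `flop`, `rotate90`) and the 40x30 test grid are assumptions made for illustration; they model the behaviour described above and are not ImageMagick's implementation.

```python
def roll(img, dx, dy):
    """Cyclically shift pixels right by dx and down by dy (like -roll)."""
    h, w = len(img), len(img[0])
    return [[img[(y - dy) % h][(x - dx) % w] for x in range(w)]
            for y in range(h)]

def flip(img):
    """Mirror top-to-bottom (like -flip)."""
    return [row[:] for row in img[::-1]]

def flop(img):
    """Mirror left-to-right (like -flop)."""
    return [row[::-1] for row in img]

def rotate90(img):
    """Rotate 90 degrees clockwise (like -rotate 90)."""
    return [list(col) for col in zip(*img[::-1])]

# Label every cell with its own coordinates so any misplacement shows up.
width, height = 40, 30
image = [[(x, y) for x in range(width)] for y in range(height)]

# Same order as the command: roll, rotate, flop, roll, flip, rotate.
step = roll(image, 25, 0)
step = rotate90(step)
step = flop(step)
step = roll(step, 0, -25)
step = flip(step)
restored = rotate90(step)

print(restored == image)
```

No colors are mixed at any step; every operation only moves whole pixels, which is why the chain can undo itself exactly.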


Rotating and Shearing

While all the above image distorting operators preserve the basic rectangular format of the image, the next set does not. The results of these operators do not fit in the original rectangle (or even the rotated rectangle). To ensure the result contains the whole of the original image, the size of the image is enlarged to accommodate.

Rotating Images -- Simple Image Rotation

As you saw above, the "-rotate" operator can perform simple, image preserving distorts when you rotate an image in multiples of 90 degrees.

With other angles however, the rotated image will not fit nicely into a rectangular image. Consequently to ensure that no image data is lost, the size of the final image is enlarged just enough to accommodate the rotated image.

  convert koala.gif -rotate 30 rotate.jpg
[IM Output]
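
As a rough guide to how much the canvas grows, the following sketch computes the geometric bounding box of a rotated rectangle. The function name `rotated_bounds` is hypothetical, and the result is only the ideal geometric size; the size ImageMagick actually produces may differ by a few pixels because of rounding and the way the rotation is implemented.

```python
import math

def rotated_bounds(width, height, degrees):
    """Width and height of the box that just encloses a rotated rectangle."""
    t = math.radians(degrees)
    c, s = abs(math.cos(t)), abs(math.sin(t))
    return (width * c + height * s, width * s + height * c)

# A 75x75 image rotated by 30 degrees needs a noticeably larger canvas.
new_w, new_h = rotated_bounds(75, 75, 30)
print(round(new_w, 2), round(new_h, 2))
```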

Note that the direction of rotate is clock-wise. This may seem illogical mathematically, until you realise that the image coordinate system is relative to the top-left of the image instead of the mathematical norm of the bottom-left. The result is the angle of rotation is also inverted. This is important to keep in mind when dealing with any form of image rotation, vs a diagrammatic or mathematical rotation.

The extra space added by ImageMagick is colored with the current "-background" color setting, allowing you to specify the color used to fill the corners.

  convert koala.gif -background lightblue -rotate 30 rotate_color.png
[IM Output]

Of course if you want to fill with the transparent color, you will need to ensure the image can handle transparency (by using a "-matte" operator to add an alpha channel), and is saved to an image format that can handle transparency.

  convert koala.gif -matte -background none -rotate 30 rotate_trans.png
[IM Output]

If the extra space comes out black, then your output image format does not allow the use of an alpha channel (most likely the JPEG format), so the transparency defaults to black.

Before version 6.1.2, "-rotate" did not handle transparency correctly, producing stripes of black and transparent in the corners of the rotated image. The workaround for this problem was rather complex, involving rotating the alpha channel separately to the colors.

But what if you don't want that extra space, and would rather preserve the image's original size? Well, you can use a centered "-crop" to return the image to its original size.

If you don't know what the original size was, you can use an alpha composition trick (see the 'Src' Compose Method) to restore the image back to its original size.

    convert koala.gif -matte \( +clone -background none -rotate 30 \) \
            -gravity center  -compose Src -composite   rotate_resized.png
[IM Output]

The "-rotate" operator also understands two extra flags. If a '>' symbol is added to the rotate argument (before or after the number), then the image will only be rotated if it is wider than it is tall. The result is that this flag will only rotate 'landscape' (wide) style images into 'portrait' (tall) style images.

The other flag, '<', does the opposite: it only rotates images that are taller than they are wide.

Digital photos can also be rotated to match the camera orientation (based on the "EXIF:Orientation" meta-data) by using the "-auto-orient" operator. However remember saving back to JPEG format may not be a good idea.

The Rotate Operator is actually implemented using various Simple Distorts and Image Shears, a technique known as 'Rotate by Shear' (RBS) and first published in research papers by Alan Paeth.

Because non-simple rotates are implemented as a sequence of Shears there are some subsequent effects that result.

First, the areas added into the image corners are just directly filled in using the "-background" color. You cannot control this using the Virtual Pixel Setting.

Also you can not control the mixing of colors during the Image Rotation using the Interpolation Setting. Which means you can not preserve the original colors of the image.

Rotating a Thin Line - rotation color blurring

For example here I rotated a simple vertical line of pixels by 17 degrees, which I then scaled so as to show the effect of the individual pixels in the resulting image.

  convert -size 10x30 xc: -draw 'line 5,0 5,30' \
          -background white   -rotate 17  -scale 500%  rotate_magnify.gif

Note how the line seems to phase in and out in the rotated image as it crosses pixel boundaries. Not only that, but as rotate is implemented as multiple shears (see note above), the line only phases in sharply every second pixel column. That is, the result is actually more blurry than it needs to be.

[IM Output]

This 'rotation blur' is especially noticeable when you rotate small text and labels, either attached to or part of a photo, by very small angles. Such lines contain lots of fine detail that 'disappears' in a regular and highly noticeable wave of blurring across the image.

The only direct solution to this problem is to use a Pixel Mapping technique, such as an Affine Distortion or Scale-Rotate-Translate (SRT). For examples of this see Interpolation of a Rotated Line.

However a much easier, though indirect, method is to use the Super Sampling technique (see below). Basically we rotate an image that is at least twice the size of the final image wanted. After rotating, the resize will sharpen the resulting rotated image, producing very sharp and clean lines, edges, and fonts.

An example of this solution is given both in the Super Sampling introduction below, and also in a more practical way when using the Polaroid Transform, which is especially vulnerable to small angle rotation blur of the added text caption.

For a deeper understanding of the various image rotation algorithms and the issues involved see Leptonica Rotation.

Shearing Images -- Linear displacement

The "-shear" operator takes each row (or column) of pixels and slides them along so that each row (or column) is displaced by the same amount relative to the neighbouring row (or column). Its two arguments are given in terms of angles.

Just as with "-rotate" the operation increases the size of the resulting image so as not to lose any information.

However shear is more complex in that it is really a double operation.

  convert koala.gif  -shear 20      shear_rot.gif
  convert koala.gif  -shear 20x0    shear_x.gif
  convert koala.gif  -shear 0x50    shear_y.gif
  convert koala.gif  -shear 20x50   shear_xy.gif
  convert koala.gif  -shear 20x0  -shear 0x50   shear_xy2.gif
  convert koala.gif  -shear 0x50  -shear 20x0   shear_yx.gif
[IM Output] [IM Output] [IM Output] [IM Output] [IM Output] [IM Output]

If you look at the results you will see that the double form of the "-shear" operation is to do the X shear first, followed by the Y shear (with an appropriate final image size correction). This is fundamentally different to a Y shear followed by the X shear.

If only one number is provided (without any 'x') then "-shear" will apply it in both the X and Y directions as a sort of poor man's rotate. For this reason the units of shear are in degrees.

The "-background" color setting is of course used as the color for the extra space added.

  convert koala.gif  -background none  -shear 30  shear_trans.png
[IM Output]

Before IM version 6.1.2 "-shear" did not handle transparency. The workaround for this problem was rather complex, involving shearing the alpha channel separately to the colors.

Note that using "-shear" in this way is not a proper method of rotating an image.

To actually use shear to rotate an image properly, you would need to perform three shearing operations in the form of "-shear {X}x{Y} -shear {X}x0 -crop ... ". However, working out the proper values for the '{X}', '{Y}' and final crop requires some trigonometry. The builtin "Rotate Operator" (see above) is actually implemented in this way.
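
The trigonometry behind a three-shear rotation can be checked numerically. The sketch below uses the textbook identity usually attributed to Paeth: an X shear by -tan(angle/2), a Y shear by sin(angle), and the same X shear again multiply out to a rotation matrix. It works in the ordinary mathematical axis convention, so signs may need flipping for image coordinates, and it does not show how these shear factors translate into "-shear" arguments. All function names are illustrative assumptions, not ImageMagick internals.

```python
import math

def matmul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def shear_x(factor):
    """Shear that slides rows sideways: x' = x + factor * y."""
    return [[1.0, factor], [0.0, 1.0]]

def shear_y(factor):
    """Shear that slides columns vertically: y' = y + factor * x."""
    return [[1.0, 0.0], [factor, 1.0]]

def rotation(degrees):
    """Standard 2x2 rotation matrix."""
    t = math.radians(degrees)
    return [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]

def rotation_from_shears(degrees):
    """Compose X shear, Y shear, X shear into a single matrix."""
    t = math.radians(degrees)
    a = -math.tan(t / 2.0)   # factor for both X shears
    b = math.sin(t)          # factor for the middle Y shear
    return matmul(shear_x(a), matmul(shear_y(b), shear_x(a)))

direct = rotation(30)
sheared = rotation_from_shears(30)
largest_error = max(abs(direct[i][j] - sheared[i][j])
                    for i in range(2) for j in range(2))
print(largest_error)  # difference is at floating point rounding level
```

Because each shear only slides whole rows or columns, the three passes can be done quickly, but each pass also blends neighbouring pixels, which is consistent with the extra blurring discussed for rotated thin lines.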

Note that shearing in the X direction will not affect an image's height, while shearing in the Y direction will not affect the image's width. The result is that the area covered by some object within the image will not change (only the surrounding container holding the image).

The Shear Operator is implemented as a direct 'warping' of the source image. As a consequence it does not use the Interpolation Setting or the Virtual Pixel Setting.

As a result the areas added to the image are only filled with the current "-background" color, and no means are provided to preserve the original colors of the image.

Waving Images - Sine Wave Displacement

The "-wave" operator is like "-shear" in that it adds a 'linear displacement' to images. However this operator will only displace columns of pixels vertically according to a sine wave function.

There are two arguments to the "-wave" operator. The first is the maximum height or amplitude by which the pixels will be displaced either up or down, while the second is the wavelength of the sine function in pixels.

  convert koala.gif  -wave 10x64 wave.jpg
[IM Output]

Note that because pixels can be displaced up to the given amplitude that much extra space will always be added to both the top and bottom of the image, even if that space is not actually needed.

For example, by adjusting the arguments so that the wavelength is double the width of the image, you can make the image into an arc.

  convert koala.gif  -wave 20x150 arched.jpg
[IM Output]

In this sort of case the unused space can be removed using either a "-chop", "-shave", or possibly even a "-trim" operation.

Let's clean up the previous example by using a negative amplitude to flip the arc over, and use "-chop" to remove the unused space the "-wave" operator added.

  convert koala.gif  -wave -20x150 -gravity South -chop 0x20 arched_2.jpg
[IM Output]

Of course the "-background" color setting can be used to define the extra space added to the image.

  convert koala.gif -matte -background none -wave 10x75 wave_trans.png
[IM Output]

As you can see from the above examples "-wave" only applies in the vertical or 'Y' direction. If you want to add a wave in the X direction, you'll need to rotate the image before and after you apply the wave.

  convert koala.gif  -rotate -90 -wave -10x75 -rotate 90 wave_y.jpg
[IM Output]

The technique can be used to add a wave pattern or vibration to an image at any angle. Examples of this are given in the Vibrato Font and in the Smoking Font.

One other limitation with "-wave" is that the wave only starts at zero. That is, the leftmost column is not displaced, while the next few columns are displaced downward (positive Y direction), unless you give a negative amplitude, in which case the initial displacement is upward.

Basically the "-wave" operator does not (at this time) allow you to specify an offset for the start of the sine function. This can be rectified however by adding, then removing, an image offset using "-splice".

  convert koala.gif  -splice 19x0+0+0 -wave 10x75 \
                     -chop   19x0+0+0     wave_offset.jpg
[IM Output]
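
The effect of the splice can be sketched numerically. The snippet assumes the vertical displacement of a column follows amplitude * sin(2 * pi * column / wavelength), which matches the description above but is not taken from ImageMagick's source; `wave_offset` is a hypothetical helper.

```python
import math

def wave_offset(column, amplitude, wavelength):
    """Assumed vertical displacement of a pixel column under -wave."""
    return amplitude * math.sin(2.0 * math.pi * column / wavelength)

amplitude, wavelength, splice = 10, 75, 19

# Without the splice the first column of the image sits at phase zero.
plain = wave_offset(0, amplitude, wavelength)

# Splicing 19 blank columns onto the left pushes the image's first
# column to position 19, roughly a quarter of the 75 pixel wavelength,
# so it now starts close to the peak of the sine curve.
shifted = wave_offset(0 + splice, amplitude, wavelength)

print(plain, round(shifted, 3))
```

Chopping the same 19 columns afterwards removes the padding but leaves the image's columns with the shifted phase.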

While "-wave" will not make use of the current Virtual Pixel Setting to define the color of the added areas, it will look at the current Interpolation Setting to map the colors from the source to the image generated. This means wave will tend to blur pixels slightly in vertical bands across the image...
EXAMPLE: wave an image of horizontal lines.

However you can use a special Super Sampling technique to reduce the blurring of the results.

EXAMPLE: supersampled wave image


Circular Distortions

So far the image distortions have been rather mild, with very little stretching, expanding or compressing of the image data. That is the data remains pretty well unchanged.

These next few image operators can result in an image that is so distorted that the original image cannot be determined. The colors are twisted into a blurry mess.

It also happens that they limit the distorting effects to a circular area with little to no distortion of the original image at the edge of the image rectangle. That means you can use these operators on a small area or Image Region and the result will still blend into the original image without it looking like it was: cut out, warped and pasted back into place.

That is they are limited 'local' distortions.

Imploding Images

The "-implode" operator warps the image so as to pull all the pixels away from the edge toward the center. It's sort of like sticking a vacuum, or 'black hole', in the center of the image and sucking the pixels toward it.

You are advised, however, to start with only very small values.

  convert koala.gif -implode .6 implode.gif
[IM Output]

Using increasingly large values will essentially suck all the pixels in the circle, into oblivion.

  convert koala.gif -implode 5 implode_big.gif
[IM Output]

However be warned that any "-implode" value larger than '1.0' is also affected by the Virtual Pixel Setting, as the algorithm starts to make color references beyond the boundaries of the actual image itself. As the default "-virtual-pixel" setting is 'edge', the edge color or surrounding frame on an image can have a major effect on the result.

For example, these two images are the same except that one had a white border added to it. This basically shows the area which is using colors looked up from beyond the bounds of the image proper, the area normally defined by the "-virtual-pixel" setting.

  convert rose: -gravity center -crop 46x46+0+0 +repage \
                                              -implode 3   implode_rose.gif
  convert rose: -gravity center -crop 44x44+0+0 +repage \
                -bordercolor white -border 1  -implode 3   implode_rose_2.gif
[IM Output] [IM Output]

Other "-virtual-pixel" settings can result in highly distorted copies of the image appearing within the imploded region. Here, for example, I implode a simple box image with a 'tile' setting.

  convert -size 94x94 xc:red -bordercolor white -border 3 \
          -virtual-pixel tile     -implode 4   implode_tiled_box.gif
[IM Output]

More "-virtual-pixel" effects are explored on Implosion Effects of Virtual Pixels.

As the number of pixels being imploded into a small area increases, the results start to get a 'pixelated' look. To get a better more consistent result, you can increase the number of pixels implode works with, using a technique called Super-Sampling.

Basically by using a larger image (enlarging the source image if necessary), doing the distortion, then shrinking the result to its final size you will produce a much better result.

  convert -size 94x94 xc:red -bordercolor white -border 3 \
          -virtual-pixel tile  -resize 400%  -implode 4 -resize 25% \
          implode_tiled_ss.gif
[IM Output]

As you can see you get a much smoother and more realistic result that shows the internal detail of the distortion much better. We will look at this technique a lot more later.

By using a larger "-border" around the image being imploded, and later removing it again, you can also warp the edges of an image.

  convert koala.gif -bordercolor blue -border 20x20 \
          -implode .5   -shave 18x18  implode_border.jpg
[IM Output]

As of IM version 6.2.1 you can also use a transparent border, or image with transparency...

  convert koala.gif -bordercolor none -border 20x20 \
          -implode .5   -shave 18x18  implode_border_trans.png
[IM Output]

Exploding Images

By using a negative value with the "-implode" operator, you can explode the image. This is however more like magnifying the center of the image pushing all the mid-radius pixels out toward the edge, rather than a true explosion.

  convert koala.gif -implode -2 explode.jpg
[IM Output]

Using a larger value will essentially enlarge the center most pixels of the image into a circle two-thirds the size of the smallest image dimension.

  convert koala.gif -implode -30 explode_big.jpg
[IM Output]

And here is a 'Super-Sampled' version.

  convert koala.gif -resize 400% -implode -30 \
          -resize 25% explode_big_ss.jpg
[IM Output]

The central color of the internal 'explosion' is set by the color of the center of the image (or region). This means that by changing the colors around that point, before exploding your image, you can control the 'flash' effect of the explosion. See Distortion Animations for an animated example of this color control.

Swirling Image Whirlpools

The "-swirl" operator acts like a cake mixer. It will warp the image around in a circle the number of degrees you give it as an argument.

  convert koala.gif -swirl 180 swirl.jpg
[IM Output]

By adding a border, and combining with "-implode" you can give the look of a whirlpool sucking the image up to oblivion.

  convert koala.gif -bordercolor white -border 20x20 \
          -swirl 180 -implode .3  -shave 20x20   whirlpool.jpg
[IM Output]

I have animated these swirling effects, which you can see below in Distortion Animations.

Limiting the Distort with Regions

All of the distortion operators work on a full image. However some of the distortion methods allow you to distort a smaller area or "-region".

For example here we have a line of stripes.

  convert -size 600x70 xc:darkred \
          -fill white -draw 'roundrectangle 5,5  595,65 5,5' \
          -fill black -draw 'rectangle 5,25 595,31' \
          -fill red -draw 'rectangle 5,39 595,45' \
          lines.gif
[IM Output]

Now by defining regions we can distort the line in various ways in specific places.

  convert lines.gif \
          -region 90x70+10+0    -swirl  400  \
          -region 90x70+100+0   -swirl  400 \
          -region 90x70+190+0   -swirl -400 \
          -region 120x70+280+0  -implode 1.5 \
          -region 100x70+380+0  -implode -7  \
          -region 101x90+480-10 -wave 10x50 \
          lines_regions.gif
[IM Output]

Note that the two Circular Distortion operators "-implode" and "-swirl", fit into the use of regions very well, as they have the property that the outside edge of the distorted image matches up to the rest of the image outside the defined region.

Also note that we could even use a Wave Distortion in the above, but that to do so, I had to enlarge its defined "-region" of operation. This is very tricky, and only works in the special case when the region covers the whole height of the image being processed.

How Regions Work

In reality the way regions work is that they extract the area of the image that the "-region" covers. This extracted image is then run through the distortion operator, and then it is Composed back on top of the original image.

This actually works for any image operation (not just distorts).

For example here I define a region on the koala image, and rotate it, with a 'blue' background 'corner fill'...

  convert koala.gif  -region 30x30+10+10 -background blue -rotate 30 \
          koala_region_1.gif
[IM Output]

Even though this operation was not very useful in terms of results, you will notice that only the defined region was modified.

The other thing that you should notice, is that as the resulting image was enlarged by the rotation the enlarged image was still overlaid at the original coordinates, overflowing the areas from which the image was extracted.

The point of this is that while regions can be used for interesting effects, or for limiting the effect of operators, they will break down in their intended use if the size of the extracted image is changed. That is, you need to be careful in what operations you apply to an extracted region.

Now what if, instead of using a 'blue' corner fill, we use a transparent fill?

  convert koala.gif  -region 30x30+10+10 -matte \
          -background none -rotate 30   koala_region_2.gif
[IM Output]

This shows that when a region is overlaid back onto the original image, anything that is transparent will show the original image. That is the extracted region is only a copy, and does not erase the original area.

Finally you can do multiple operations on a region. For example...

  convert koala.gif  -region 30x30+10+10 -matte \
          -background blue -rotate 30 -background red -rotate 30 \
          koala_region_3.gif
[IM Output]

Note that the second 'red' rotation was applied to the previously 'blue' rotated image extracted from the region. That is, the modified region was not restored to the original image until after both rotations had been applied.

In other words, the "-region" operator extracts an area of image to perform operations in, and will only restore that area (using Composite Over) when either the end of the command is reached, or another "-region" is requested.


General Distortion Techniques

Now that we have been introduced to the simple distortion operators that IM provides, let's take a step back and look at the nitty-gritty, and see how image distortions actually work, and how you can improve the way you use them.

Later we'll go forward to much more complex ways of distorting images, including methods that are not directly built into ImageMagick.

There are only a few basic ways an image processor can distort images.

The Simple Distortion operators for example are achieved by Pixel Swapping. That is individual pixels, or even whole rows and columns of pixels are just swapped around to Flip, Roll, Transpose, and even do Rectangular Rotates of images. No color changes are made, and the number of pixels remains the same.

The next method of distorting images is to Shift or Shear the pixels either horizontally or vertically, such as what IM does with Image Shearing and the Wave Distortion above. The shears in turn provide a method to Rotate Images by any given angle, in a manner that should be quite fast.

However pixel shifting methods are limited to those basic distortions. They cannot scale an image to a different size, for example. You also have very little control over the handling of new areas of the resulting image that are not covered by the source image. In the above mentioned functions IM just sets the missing areas to the current background color.

To be able to distort images in a much more general way you need to use a more general distortion technique known as Reverse Pixel Mapping. For example this method is used by the more complex Circular Distortions such as Imploding and Swirling images.

Reverse Pixel Mapping

If we try to directly map each pixel from the source image to the destination image, one pixel at a time, as is done for Simple Distorts, we get lots of problems. A specific pixel may map to a location that another pixel has already colored (overlaps), while a destination position may not even get any color from the source image (holes).

The reason is that pixels require an integer location in both the source and destination. But a general transformation generally produces a real, or floating point value. That is a direct mapping of pixels from an old location to a new position just does not work.

Reverse Pixel Mapping does the opposite. For each and every pixel in the destination image, the color needed for that pixel is looked up from the pixel array in the source image. As each and every pixel is processed, we can be sure that every pixel in the destination gets one and only one color. So as long as we can figure out the 'source' location for each destination pixel (real or virtual), we can distort a source image to the destination image using any mathematical formula you can imagine.

[photo]

In Summary, a distortion mapping does the following.
For each pixel (I,J) in the destination or output image
   Map the I,J pixel position to a U,V pixel position in the original image
   Look up the Color of the original image at position U,V
       Using color interpolation, work out the appropriate color.
       Or the virtual-pixel setting, if it misses the actual source image.
   Set the destination images color for pixel I,J

Note that I used the variable names 'i,j' and 'u,v' in the above as these variables map into the variables that you would use in the "-fx" or DIY Operator.
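
The summary above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not ImageMagick's implementation: the image is a nested list of values, the color lookup is a plain nearest-neighbour pick, a fixed `background` value stands in for the virtual-pixel setting, and the names `distort` and `reverse_map` are hypothetical.

```python
import math

def distort(src, width, height, reverse_map, background=0):
    """Build a width x height image by reverse pixel mapping.

    reverse_map(i, j) returns the source position (u, v) for the
    destination pixel in column i, row j.
    """
    src_h, src_w = len(src), len(src[0])
    dest = []
    for j in range(height):
        row = []
        for i in range(width):
            u, v = reverse_map(i, j)
            # Nearest-neighbour lookup: round to the closest source pixel.
            ui, vi = int(math.floor(u + 0.5)), int(math.floor(v + 0.5))
            if 0 <= ui < src_w and 0 <= vi < src_h:
                row.append(src[vi][ui])
            else:
                row.append(background)   # the lookup missed the source
        dest.append(row)
    return dest

source = [[1, 2, 3],
          [4, 5, 6]]

# Swapping the coordinates in the reverse map transposes the image.
transposed = distort(source, 2, 3, lambda i, j: (j, i))

# Looking one pixel to the left shifts the image right; the first
# column has no source pixel and receives the background value.
shifted = distort(source, 3, 2, lambda i, j: (i - 1, j))

print(transposed)
print(shifted)
```

Every destination pixel is visited exactly once, so there are no holes or overlaps regardless of what formula `reverse_map` uses.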

The distinction between forward and reverse mapping is important as most mathematical transformations are defined as forward mappings, mapping a single source (U,V) position to a destination (I,J) position. And indeed a 'forward mapping' works well for vector graphics and for drawing lines, where you can just map the ends of the line and draw it. This is especially true for any linear transformation, such as rotations, where lines remain straight. It is in fact what is done for vector based languages such as PostScript and SVG.

But for a general raster image, you must use a reverse mapping to distort the image, to ensure you 'fill in' all the pixels of the destination image. The problem is that not all forward mapping transforms work as a reversed transform. On the other hand some image distortions work very well when used as a reverse mapping, allowing you to produce interesting effects.

You can see the above in more detail in a DIY explanation in the sub-page DIY Distortion Mapping, as well as later in the form of a more generalised technique of using Distortion LUT Maps.

What's in a name?

During my study I found that there is no real clear naming of this image processing method. The actual algorithmic process is known as 'Reverse Pixel Mapping', while the use of mathematical equations is known as a 'Geometric Transformation'. If the distortion is controlled by the movement of various control points, it is known as 'Image Warping' or 'Rubber Sheeting'. The process of defining specific points, usually to match up two or more images, is known as 'Image Registration'.

Images can also be subdivided into smaller simpler units which are individually distorted using a technique called 'Gridding' (quadrilaterals) and 'Triangular Meshing' (triangles). By using small incremental distortions with blending of a start and end image you generate 'Image Morphing' such as you see in movies and music videos.

In 3D modeling, and in 3D computer games, the same techniques are also used to give some type of colored pattern to flat and curved surfaces in a method known as 'Texture Mapping'. This can involve sub-dividing images into grids and meshes that approach a single pixel.

All the above are very closely related, and all basically involve the exact same thing: distorting images by looking up a pixel color based on a specific destination coordinate. What term should be used... Take your pick.

For an alternative discussion of distortion transforms, see Leptonica, Affine Implementation and specifically its discussion of 'point-wise' method. The other method, 'sequential', is essentially how IM implements its Rotate and Shear distortion operators (see above).

Direct Interpolated Pixel Lookup

There are still a few problems with the above Reverse Pixel Mapping technique. First of all, when mapping a pixel from an integer position on the destination you can end up with a non-integer position on the source, that is, a location that falls between pixels on the source image. To determine what color should be returned, a process called Interpolation is used to determine the final color for that real position by mixing the colors of the surrounding pixels.
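
As an illustration of mixing the surrounding pixels, here is a sketch of bilinear interpolation, one common interpolation method. It is chosen here only as an example and is not claimed to be the method ImageMagick uses by default. The function name `bilinear` and the single-value 'colors' are assumptions; the position is assumed to lie inside the source grid.

```python
import math

def bilinear(src, u, v):
    """Color at real-valued position (u, v) inside the source grid."""
    x0, y0 = int(math.floor(u)), int(math.floor(v))
    # Clamp the neighbour indexes so the last row/column is still valid.
    x1 = min(x0 + 1, len(src[0]) - 1)
    y1 = min(y0 + 1, len(src) - 1)
    fx, fy = u - x0, v - y0          # fractional distance into the cell
    top = src[y0][x0] * (1 - fx) + src[y0][x1] * fx
    bottom = src[y1][x0] * (1 - fx) + src[y1][x1] * fx
    return top * (1 - fy) + bottom * fy

grid = [[0, 10],
        [20, 30]]

# Exactly between all four pixels: an equal mix of the four values.
centre = bilinear(grid, 0.5, 0.5)

# Exactly on a pixel: that pixel's value is returned unchanged.
corner = bilinear(grid, 1, 0)

print(centre, corner)
```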

Then you have the problem of what to do when the mapped position 'misses' the source image completely. What color should be returned is determined by the Virtual Pixel setting, which can let you pick the color of the nearest edge of the source image, pretend the source image is infinitely tiled (or mirror tiled), or use some specific color such as 'white', 'black', or 'transparent', or the user defined background color.
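
The choices described above can be sketched as a lookup function that accepts positions outside the image. This is an illustrative model only: the function `virtual_pixel`, its arguments, and the way each case is computed are assumptions for the sketch, with the method names borrowed from the settings mentioned in the text.

```python
def virtual_pixel(src, x, y, method="edge", background=0):
    """Return a color for integer position (x, y), even outside src."""
    h, w = len(src), len(src[0])
    if 0 <= x < w and 0 <= y < h:
        return src[y][x]                      # a normal 'hit'
    if method == "edge":
        # Clamp to the nearest pixel on the image boundary.
        return src[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]
    if method == "tile":
        # Pretend the image repeats forever in both directions.
        return src[y % h][x % w]
    if method == "mirror":
        # Repeat the image, flipping every second copy.
        px, py = x % (2 * w), y % (2 * h)
        if px >= w:
            px = 2 * w - 1 - px
        if py >= h:
            py = 2 * h - 1 - py
        return src[py][px]
    return background                          # fixed color fallback

strip = [[1, 2, 3]]
print(virtual_pixel(strip, -5, 0, "edge"),
      virtual_pixel(strip, 4, 0, "tile"),
      virtual_pixel(strip, 3, 0, "mirror"),
      virtual_pixel(strip, 9, 0, "background"))
```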

There is also the possibility that there is no mathematically valid source position for the destination position being mapped. For example the pixel looks into the 'sky' of a perspective 'plane' (See Viewing Distant Horizons). This should have a completely different response to just a normal 'miss' of the source image, as it never even 'hits' the source image space.

The Interpolation setting will also handle the case that part of an image becomes 'stretched out' so that a single source pixel becomes smeared over a large area of the destination image. However the opposite is not handled very well by a simple interpolation method. And that requires other techniques.

Super Sampling, Improving Distortion Results

If part of the source image gets compressed into a much smaller area, the color of the destination image in that area should probably be a mix of a number of source pixels, taken over a larger area. However only one single color lookup is performed by a simple reverse pixel mapping.

The result is Aliasing Artifacts, producing moire effects when distorting textures, edges with pixel level 'staircase' effects, and thin lines looking more like dotted lines.

One solution to this is to take a lot more color readings from the source image, so as to try to determine a more correct color for each pixel in the destination image. The simplest technique is generally known as super-sampling, or over-sampling. See Wikipedia Entry on Super-Sampling.

Instead of taking just one color (sample) from the source image for each destination pixel that is generated, a larger number of samples is taken over the 'area' of each output pixel.

By taking more samples from the source image for each destination pixel, the final color of the individual pixel will become a more accurate representation of the distorted image at that point. The more color samples you take, the more accurate the final color will be, and the result has a smoother, more realistic look.

Remember this technique only really improves the look of the destination in areas where the source image becomes compressed by more than 50%. In areas where the distortion enlarges the source image, or keeps it about the same scale, an Interpolated Lookup of the source image will generally produce a good result with just one single lookup.
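The difference between one lookup and many can be sketched in a few lines of Python. Everything here is a made-up illustration, not IM code: the 'source image' is a hypothetical function returning one pixel wide stripes, and the 'distortion' is a hypothetical reverse mapping that squeezes the source to one eighth of its width.

```python
def source_color(x, y):
    """Hypothetical source image: vertical stripes one unit wide (0 or 1)."""
    return int(x) % 2

def reverse_map(dx, dy):
    """Hypothetical distortion: destination shows the source at 1/8 width."""
    return dx * 8.0, dy

def single_sample(px, py):
    """One lookup at the center of destination pixel (px, py)."""
    sx, sy = reverse_map(px + 0.5, py + 0.5)
    return source_color(sx, sy)

def super_sample(px, py, n):
    """Average an n x n grid of lookups spread over the destination pixel."""
    total = 0.0
    for j in range(n):
        for i in range(n):
            sx, sy = reverse_map(px + (i + 0.5) / n, py + (j + 0.5) / n)
            total += source_color(sx, sy)
    return total / (n * n)

aliased = [single_sample(px, 0) for px in range(4)]
smooth = [super_sample(px, 0, 8) for px in range(4)]
```

With a single lookup every destination pixel happens to land on a dark stripe, so the stripes vanish completely. The 8×8 grid sees equal amounts of both stripe colors and returns the expected mid-grey.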

In Imploding Images (above), I touched briefly on the simplest method of 'super-sampling': enlarging the size of the output image (in this case by enlarging the input image), performing the distortion, and afterward resizing the image back to its normal size again.

For example...

  convert -size 94x94 xc:red -bordercolor white -border 3 \
          -virtual-pixel tile             -implode 4 \
          implode_tiled_box.gif
  convert -size 94x94 xc:red -bordercolor white -border 3 \
          -virtual-pixel tile  -resize 400%  -implode 4 -resize 25% \
          implode_tiled_ss.gif
[IM Output]
Normal Implosion of a Box Image
[IM Output]
Super Sampled Implosion

Of course rather than enlarging the image, you could either generate a larger image, or apply it to a larger source image, before you do your final resize.

This is especially useful when rotating photos and text, the source of which is usually much larger than the final result, and where the quality of the resulting image can be highly noticeable. For examples of this see the Polaroid Transform.

As of IM v6.4.2-6, the General Distortion Operator can generate an enlarged output image which you can scale (or resize). (See Output Scaling, below.)

See the next example.

This is only one method of super sampling, others are variations on this method.

One final word of warning. Super-sampling is limited by the number of samples that were used for each pixel in the final image. That is, the amount of scaling used in the final resize determines the final 'quality' of the distorted image. Using a larger image will of course be much slower to generate, and even that has limits.

In the extreme, super-sampling will not handle image distortions that involve infinities (such as in the center of an imploded image). In such cases a completely different technique is needed, one that is provided by Area Resampling (see below).

In summary, super-sampling can improve the look of images with only minor distortions, such as rotations and shears. But as you will see below, it has limits to the types of distortions that it can improve.

Adaptive Sampling

The super-sampling technique can be expanded further. Rather than just using a fixed number of color lookups for each pixel, a check is made on either the distance between the lookups in the source image, or on the colors returned, to see if we should make more samples for that specific pixel.

That is, the amount of super-sampling could be made responsive to the needs of the distortion, without knowing anything about the specifics of the distortion itself. This is Adaptive Super-Sampling.

This technique is actually very common in Ray Tracers, where it is next to impossible to determine just how complex the resulting image is at specific points. In this case it is often restricted to using 'color differences' to determine when more samples are needed.

IM does not currently support adaptive super-sampling, though it is now quite possible to add alternative sampling methods for use by the General Distortion Operator (see below).
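As IM has no such feature, the following is purely a conceptual Python sketch of the 'color differences' variant of the idea. The function `adaptive_sample`, its `threshold` and `max_depth` parameters, and the two test 'images' are all hypothetical. A destination pixel is split into quarters only where its corner samples disagree.

```python
def adaptive_sample(color_at, px, py, threshold=0.1, max_depth=3):
    """Average color_at() over the unit pixel at (px, py).

    Returns (color, lookups). A cell is subdivided only while its four
    corner samples differ by more than threshold.
    """
    def sample_cell(x0, y0, size, depth):
        corners = [color_at(x0, y0), color_at(x0 + size, y0),
                   color_at(x0, y0 + size), color_at(x0 + size, y0 + size)]
        if depth >= max_depth or max(corners) - min(corners) <= threshold:
            return sum(corners) / 4.0, 4
        half = size / 2.0
        total, lookups = 0.0, 0
        for oy in (0.0, half):
            for ox in (0.0, half):
                value, used = sample_cell(x0 + ox, y0 + oy, half, depth + 1)
                total += value
                lookups += used
        return total / 4.0, lookups

    return sample_cell(float(px), float(py), 1.0, 0)

def flat(x, y):
    """A region of constant color needs no extra samples."""
    return 0.25

def edge(x, y):
    """A sharp vertical edge crossing the pixel at x = 0.3."""
    return 1.0 if x > 0.3 else 0.0

flat_value, flat_lookups = adaptive_sample(flat, 0, 0)
edge_value, edge_lookups = adaptive_sample(edge, 0, 0)
```

The flat pixel is resolved with only its four corner lookups, while the pixel containing the edge automatically receives many more.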

Area Resampling, for better Distortions

The best alternative to super-sampling methods is Area Re-sampling.

Rather than distorting a larger image and averaging the results by resizing, or just taking and averaging more samples from the image, we actually determine exactly how many pixels from the source image should be merged together (based on the 'scale' of the distortion at that point) to generate each specific output pixel. That is, we figure out the 'area' within the source image that each output pixel represents.

In fact this is exactly what the ImageMagick Resize Operator (a simpler form of image distortion) does to generate such good results. However for resize, you only need to calculate the scale and area needed to be sampled, once for the whole image.

When area re-sampling a distorted image, the area that each generated pixel covers will change with position. Some pixels may only need to merge a few source image colors, or even just one color, while another pixel elsewhere in the image may need to merge a very large number of pixels to generate the correct color.

Also the area that a destination pixel represents in the source image may not be a simple square or circle, but may actually be a very weird, awkward, distorted shape, according to the distortion being used. Calculating and handling such awkward shapes can be very time consuming, or near impossible to achieve. As such it may be better to use only an approximation of the area of the source image needed.

For example here is a diagram showing how a round pixel in the final image needs to use the colors from a larger, elliptical area in the source image.
[IM Output]

Using an elliptical area of the source image to calculate colors for each destination pixel, is a method known as Elliptical Weighted Average (EWA) Re-sampling, and was outlined in the PDF research paper "Fundamentals of Texture Mapping and Image Warping" by Paul Heckbert. This was then used to define the new Generalized Distortion Operator (see next).
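One way to picture the 'scale of the distortion at a point' is to look at how far the source position moves when the destination position moves by one pixel. The Python sketch below estimates that with finite differences of a reverse mapping. This is only an illustration of the idea under my own assumptions, not the EWA implementation: the function names and the `step` parameter are hypothetical.

```python
def source_area_per_pixel(reverse_map, dx, dy, step=0.5):
    """Estimate the source area covered by one destination pixel at (dx, dy)."""
    # Central differences approximate the partial derivatives of the mapping.
    x_right, y_right = reverse_map(dx + step, dy)
    x_left, y_left = reverse_map(dx - step, dy)
    x_down, y_down = reverse_map(dx, dy + step)
    x_up, y_up = reverse_map(dx, dy - step)
    dsx_ddx = (x_right - x_left) / (2 * step)
    dsy_ddx = (y_right - y_left) / (2 * step)
    dsx_ddy = (x_down - x_up) / (2 * step)
    dsy_ddy = (y_down - y_up) / (2 * step)
    # The determinant of these derivatives is the local area scaling factor.
    return abs(dsx_ddx * dsy_ddy - dsx_ddy * dsy_ddx)

def shrink_to_quarter(dx, dy):
    """Hypothetical distortion: the output shows the source at 25% size."""
    return dx * 4.0, dy * 4.0

area = source_area_per_pixel(shrink_to_quarter, 10.0, 10.0)
```

For this uniform shrink every destination pixel covers a 4×4 block of source pixels. For a perspective view the same estimate would grow rapidly toward the horizon, which is why the sampled area has to be recalculated for every pixel.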

The results of this algorithm are especially good for extreme scale reductions such as produced by perspective distortions. For example here are all three re-sampling methods for an infinitely tiled perspective image. See Viewing Distant Horizons below for details.

  # input image:  checkerboard with a gold outline.
  convert -size 90x90 pattern:checkerboard -normalize -fill none \
          -stroke gold -strokewidth 3 -draw 'rectangle 0,0 89,89' \
          checks.png

  # Using Direct Interpolated Lookup  (Perspective View)
  convert checks.png  -filter point \
          -virtual-pixel tile -mattecolor DodgerBlue \
          -distort Perspective '0,0 20,60  90,0 70,63  0,90 5,83  90,90 85,88' \
          horizon_tile_point.png

  # Using Grid Super Sampling
  convert checks.png  -filter point  -set option:distort:scale 10 \
          -virtual-pixel tile -mattecolor DodgerBlue \
          -distort Perspective '0,0 20,60  90,0 70,63  0,90 5,83  90,90 85,88' \
          -scale 10%    horizon_tile_super.png

  # Using Area Resampling (default)
  convert checks.png       -virtual-pixel tile -mattecolor DodgerBlue \
          -distort Perspective '0,0 20,60  90,0 70,63  0,90 5,83  90,90 85,88' \
          horizon_tile.png
[IM Output]
Direct Reverse Pixel
Interpolated Lookup
[IM Output]
Super Sampling x10
[IM Output]
Elliptical Weighted Area
Resampled Image

The last image was generated using the default EWA settings of the Generalized Distortion Operator (see below). It took 4.6 seconds to generate, which is rather reasonable.

The first image however is exactly the same, except that EWA resampling has been turned off by using a "-filter point" setting. This forces it to use Direct Interpolated Lookup for each pixel. As such this image was generated extremely fast in comparison (.51 seconds).

The middle image is like the first image, but with the output image enlarged by a factor of ten (using the distort 'scale' setting) and then scaled back down. That is, 100 samples (a 10×10 grid) were used to super-sample each destination pixel. It is quite fast to generate (1.2 seconds), and while it improves the quality of the image in general, that improvement is limited, and depends on how much super-sampling was provided. The ×10 used in the above example is very heavy.

The biggest difference between super-sampling and area-resampling is that the former only does a general improvement in quality over the whole image. As the distortion gets more severe it starts to break down, showing Resizing Artifacts in the middle ground, and a line of severe moire effects just before the horizon, where the 10 samples across per pixel match the checker board pattern close to the horizon.

On the other hand area resampling concentrates more on the problem pixels closer to the horizon (where it spends most of its time) than on foreground pixels, removing those artifacts. For simple distortions it is usually a lot faster than super-sampling, but will always be slower than direct interpolated lookup.

A simple ellipse, as used by EWA, may not be perfect for all distortions. For example the "DePolar" distortion actually requires a curved circular arc for its ideal area resampling. Because of this you may be better off using Super Sampling for some specific distortions.


Generalized Distortion Operator

With the generation of these examples, the ensuing discussions in the IM Forums, and multiple requests from users for easier and faster ways to do perspective and other distortions, a new operator was added to IM v6.3.5-1 to allow us to more easily add new types of image distortions.

The General Distortion Mapping operator is called "-distort", and you can see what distortion methods it has available on your IM version using "-list Distort".

  convert -list distort

The "-distort" operator takes two arguments, one of the distortion methods as given by the above, and a second string argument consisting of comma or space separated floating point values.

The number of floating point values given is however highly dependent on the distortion method being used, and their meanings depend not only on the method chosen, but can also depend on the exact number of values given.

This is especially the case for the 'Scale_Rotate_Translate' (or 'SRT' for short) distortion, which really combines three separate 'Affine' distortions into a single distortion.

As discussed above in Reversed Pixel Mapping, the resulting image (for magnification) is also affected by the Pixel Interpolation setting "-interpolate", which defines how to lookup pixels from the source image (especially in areas that become enlarged). For minification of the source image the Resize Filter settings control the merging of pixel colors in the Area Resampling algorithm. However using "-filter point" will turn off Area Resampling, forcing the use of a fast and simple Direct Pixel Lookup for the source color values.

Finally the Virtual Pixel setting defines what the pixels surrounding the image should look like, while the color setting "-mattecolor" defines the color of pixels where the distortion mapping is undefined (for example the sky in a perspective distortion).

Phew. That is a lot of settings that you can use, but really you generally don't need all of them. Just define them as you need to.


By default "-distort" will usually distort the source image(s) into an image that is the same size as the original image. There are exceptions to this, such as the 'Arc' distortion (a polar mapping variant) where the input source image size really does not have much meaning in the distorted form of the image (see Arc Distortion below for details).

The other form of the operator, "+distort" (Added to IM v6.3.5-7), will resize the distorted image so it will contain the whole of the input image, much like what the Rotate and Shearing operators do.

However this particular 'mode' of operation also goes further and sets the Virtual Canvas Offset (page) of the resulting image. This way if you later Flatten this image onto another image the position of your control points and origin will still be in the correct position, even though the positions within the actual image itself may not be as you would expect.

Also (depending on the distortion method) a "+distort" will attempt to take into account any existing Virtual Canvas Offset that may be present in the source image, in the distortion process.

As such you may need to make judicious use of the "-repage" attribute setting operator to clear or adjust that offset when using the 'best-fit' "+distort" form of the General Distortion Operator, if these offsets are not wanted, or you don't want them to affect the outcome of the distortion operation. See also Removing Canvas/Page Geometry.

The "-distort" will ignore any existing offset present in the source image in terms of the distortion itself, but will copy that offset unchanged to the distorted image.

In Summary... Use "-distort" to have results mapped into an image the same size. And use "+distort" to size the output to best-fit the distorted image, BUT also use and generate Virtual Canvas Offsets (page attributes).

Distort Options

verbose

By setting "-verbose" before running "-distort" (use "+verbose" to turn off again), distort will output to the standard error channel information on the algorithm and internal coefficients it calculates to distort the given image, in the given way.

You can use this information to look at and understand how the distortion works and is applied. It is also a debugging tool we can use to figure out what is going wrong, and as part of the implementation process for new distortions.

For example here is a verbose SRT 'no-op' distort that will not change the distorted image at all...

  convert koala.gif -verbose -distort SRT 0 +verbose  koala_noop.gif
[IM Output] ==>
[IM Text]
==> [IM Output]

The above output shows two alternative distortion methods for the given distortion. One is an 'AffineProjection' distortion, while the other shows a DIY FX Operator alternative detailing exactly what the distortion is doing.

Viewport

-set 'option:distort:viewport' WxH+X+Y
Added in IM v6.3.6-1, this setting defines the size and location of the distorted image. The viewport is however limited to whole integers, not floating point values. It does not enlarge the distorted image itself, it just defines the 'window' or 'viewport' into the distorted space that you want to view.

This can be used to create a destination image of a specific size, or shift it to a specific offset in the distorted image space, or you can think of it as a simple 'Viewport Crop' of the possibly infinite distorted space.

For example, here we crop the output to just the koala's head.

  convert koala.gif  -set option:distort:viewport 44x44+15+0 \
          -distort SRT 0  +repage koala_viewport.gif
[IM Output]

And here we add extra space around the image, showing the default Virtual Pixel, Edge setting effects that distort can see...

  convert koala.gif  -set option:distort:viewport 125x125-25-25 \
          -distort SRT 0  +repage koala_viewport_2.gif
[IM Output]

The final "+repage" in both the previous examples is needed to remove the virtual canvas offset that "-distort" will leave behind when a viewport is used.

Output Scaling

-set 'option:distort:scale' N
Added in IM v6.4.2-6 as a general output image scaling factor. This enlarges the output image by the given factor 'N', and thus "-distort" will need to generate N² times as many distorted lookup 'samples'.

Note that many distortions also allow you to 'scale' the size of the resulting distorted image, however the resulting image size would be unaffected by that operator's scaling (unless a 'best-fit' "+distort" was used) and the viewed location might also be shifted. This 'scale' setting however does not change the contents of the resulting image at all, it just enlarges it.

This can be used for example with an appropriate 'viewport' to produce an image that you can easily "-resize" to a specific size, allowing you to generate a controlled 'zoom' into the distorted image, without loss of quality.

For example, we 'zoom' in on the head of the koala.

  convert koala.gif -set option:distort:scale 2 \
          -set option:distort:viewport 44x44+15+0 \
          -distort SRT 0  +repage koala_zoom.gif
[IM Output]

Note that while the viewport was requested to be 44x44 pixels, the actual output image has been scaled to 88x88 pixels.

More commonly it is used as a simple means of 'Super Sampling' (see above) the distortion operation. For this an integer 'super-sampling' scale factor is used, and after distorting the image the image is scaled back to its original size, to merge the extra samples together.

Also as 'Area Re-Sampling' is not needed for 'Super Sampling' (it only slows it down), it is typically turned off by using a "-filter point" option.


  convert koala.gif -filter point -set option:distort:scale 10 \
          -distort SRT 0  -scale 10%   koala_super.gif
[IM Output]

Note that the resulting image is slightly more blurry than the original image. This is due to the default Interpolate Setting used to lookup the colors, the results of which super-sampling averages.


Scale-Rotate-Translate (SRT) Distortion

The simplest distortion, but probably one of the most versatile, is the 'SRT' or 'Scale-Rotate-Translate' distortion. (SRT is just a quick short-hand.)

You have already seen the 'no-op' example of this distortion in the last few examples, where the image is processed without any distortion being applied at all. But this is only the start of what it can do.

This distortion is actually three separate distortions in a single distortion method. All arguments except the rotation angle are optional, which makes the meaning of the arguments highly variable, depending on exactly how many comma separated arguments you give, up to the maximum of 7 floating point numbers.

  -distort SRT "..."   where the argument string is one of:

                        Angle              -> centered rotate
            Scale       Angle              -> centered scale and rotate
    X,Y                 Angle              -> rotate about given coordinate
    X,Y     Scale       Angle              -> scale and rotate about coordinate
    X,Y ScaleX,ScaleY   Angle              -> ditto
    X,Y     Scale       Angle  NewX,NewY   -> scale, rotate and translate coord
    X,Y ScaleX,ScaleY   Angle  NewX,NewY   -> ditto

What this does is take an image in which you have selected an (optional) control point. If no control point is given, the center of the input source image is used. Around that point the distortion will, in sequence... Scale the image, Rotate it, then Translate or move the selected control point to a new position. Hence the name of this distortion.

For example let's take our koala image and simply rotate it, in a similar way to the Rotate Operator, but without expanding the size of the image...

  convert koala.gif  -background skyblue  -virtual-pixel background \
          -distort ScaleRotateTranslate -110 koala_srt_rotate.png
[IM Output]

Using the 'plus' form of "+distort", and a clean up of resulting virtual canvas offsets, we can make it exactly like the Rotate Operator, but with better results.

  convert koala.gif  -background skyblue  -virtual-pixel background \
          +distort ScaleRotateTranslate -110 +repage koala_srt_rotate2.png
[IM Output]

Let's shrink it by 30% as well, but use a transparent background.

  convert koala.gif  -matte -virtual-pixel transparent \
          +distort ScaleRotateTranslate '.7,-110'  koala_srt_scale.png
[IM Output]

The next set of arguments will specify the 'center' around which the image is rotated and scaled. This point is called a 'control point' or 'handle' in the image, which is a location used to control the distortion. As we are using a specific point for this distortion, let's not use the 'best-fit' mode, to avoid the complications of 'virtual offsets'.

For example let's rotate and scale the koala around its 'nose', which is located at 28,24 in the source image. While we are at it let's distort the X and Y scales differently.

  convert koala.gif  -background skyblue -virtual-pixel background \
          -distort ScaleRotateTranslate '28,24  .4,.8  -110' \
          koala_srt_center.png
[IM Output]

And as a final example, let's also move the 'nose' to near the bottom of the image, and set the background to a matching white background.

  convert koala.gif  -virtual-pixel white \
          -distort ScaleRotateTranslate '28,24  .4,.8  -110  37.5,60' \
          koala_srt_trans.png
[IM Output]

Note that the final position is also a floating point value. In fact all the arguments can be floating point values and the distortion will do the right thing.

Remember each of the operations, Scale, Rotate, and Translate are performed in that order.

As you can see this distortion is very versatile, and while you can think of it as distorting the image using three different methods in sequence, in reality it is applying all three distortions simultaneously to produce the shown result. This makes it faster than doing multiple individual operators, and generally produces a better final result.
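To show how the three steps collapse into a single mapping, here is a small Python sketch that turns the seven 'SRT' arguments into six affine coefficients. It is my own illustration rather than IM's code. It assumes the coefficient order sx, rx, ry, sy, tx, ty with x' = sx*x + ry*y + tx and y' = rx*x + sy*y + ty, and that positive angles rotate clockwise on an image whose Y axis points down. The function names are hypothetical.

```python
import math

def srt_to_affine(x, y, scale_x, scale_y, angle_deg, new_x, new_y):
    """Combine scale, rotate and translate about (x, y) into six coefficients."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    # Scale first, then rotate (assumed clockwise with the Y axis pointing down).
    sx = scale_x * cos_a
    rx = scale_x * sin_a
    ry = -scale_y * sin_a
    sy = scale_y * cos_a
    # Choose the translation so the control point (x, y) lands on (new_x, new_y).
    tx = new_x - (sx * x + ry * y)
    ty = new_y - (rx * x + sy * y)
    return sx, rx, ry, sy, tx, ty

def apply_affine(coeffs, x, y):
    """Map a source point to its destination using the six coefficients."""
    sx, rx, ry, sy, tx, ty = coeffs
    return sx * x + ry * y + tx, rx * x + sy * y + ty

# The koala 'nose' example: '28,24  .4,.8  -110  37.5,60'
coeffs = srt_to_affine(28, 24, 0.4, 0.8, -110, 37.5, 60)
nose = apply_affine(coeffs, 28, 24)
```

Whatever the scale and angle, the control point itself always ends up at the requested new position, which is what makes it a useful 'handle'.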

The above also demonstrates the use of different Virtual Pixel settings to define the color used for the areas referenced outside the actual source image. To see the effect of Interpolation on rotations see Interpolation with Affine Rotates. It is a much better rotate than the 'sheared' form of rotate provided by the Rotate Operator.

This distortion was specifically designed to take an image of some known object and generate an animation based on the movements and rotation of that object.

For example here I create a stylized space ship, which I then animate in a very rough way. The ship sits on its base at 20,75 (for the initial 'hunker-down' scaling) while the normal 'handle' for movement and rotations is the ship's center, which is located at 20,60 in the original image. These points represent control points by which the object can then be animated in simple terms.

  convert -size 80x80 xc:skyblue -fill yellow -stroke black \
          -draw 'path "M 15,75 20,45 25,75 Z  M 10,55 30,55" ' \
          spaceship.gif
  convert spaceship.gif  -virtual-pixel background \
          \( -clone 0  -distort SRT '20,75  1.0,0.6  0' \) \
          \( -clone 0  -distort SRT '20,60     1     0  20.5,49.5' \) \
          \( -clone 0  -distort SRT '20,60    0.9   20  27,35' \) \
          \( -clone 0  -distort SRT '20,60    0.8   45  40,23' \) \
          \( -clone 0  -distort SRT '20,60    0.5   70  55,15' \) \
          \( -clone 0  -distort SRT '20,60    0.3   75  72,11' \) \
          \( -clone 0  -distort SRT '20,60    0.1   80  100,8' \) \
          -set delay 50  -loop 0  spaceship_launch.gif
[IM Output] ==> [IM Output]

Of course it is a very rough example of how you can use an 'SRT' distortion to animate a static image, but you should get the idea. You can add more frames, and perhaps some flames and smoke to improve it further (submissions welcome and best result will be added here with your name).

Distortions Using Control Points

While most of the 'SRT' distortion method is defined by specifying rotation angles and scaling factors, it also allows you to use two points to control the position of the object. One point is a location on the 'source' image, while a second point defines where that control point should move.

A lot of different distortion methods make use of control points, which are given in the form of 4 values (2 pairs of coordinates) for each single control point. So often a distortion is defined in terms of multiple sets of 4 values. For example....
U1,V1 I1,J1     U2,V2 I2,J2     U3,V3 I3,J3     U4,V4 I4,J4 . . . .
Where the control point Ui,Vi on the source image (relative to its virtual canvas) is mapped to Ii,Ji on the distorted destination image.

Before IM version 6.3.6-0, when the Distort Operator was first introduced, the coordinate ordering was defined as all the source coordinates, followed by all the destination coordinates. This however made it hard to determine which source and destination coordinates corresponded to each other.

The change in ordering also allowed for the use of more than a minimal number of control points for least squares fitting of the distortion to the control points. That in turn permitted the use of future 'image registration' techniques for automatic determination of control points.

It is defined in this way so that the movement of each individual control point is kept together in the comma (or space) separated list of floating point values. It also allows for the future use of 'control point files', where each line is either a comment (starting with '#') or defines the movement of a single control point.
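The grouping into sets of four is easy to see if you split such an argument string apart yourself. The Python helper below is a hypothetical sketch for illustration only (it is not part of IM), treating commas and white space as equivalent separators.

```python
import re

def parse_control_points(argument):
    """Split 'U,V I,J  U,V I,J ...' into ((U, V), (I, J)) pairs."""
    values = [float(v) for v in re.split(r"[,\s]+", argument.strip()) if v]
    if len(values) % 4 != 0:
        raise ValueError("expected groups of four numbers: U,V I,J")
    return [((values[i], values[i + 1]), (values[i + 2], values[i + 3]))
            for i in range(0, len(values), 4)]

pairs = parse_control_points("30,11 15,15  48,29 60,15")
```

Each entry in the result holds one source coordinate together with the destination coordinate it should move to.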

The simplest distortion using control points is the 'Affine' distortion, though this as you will see later is usually defined in terms of three points.

For example, like the 'SRT' distortion we can 'translate' an image by giving just one set of control point pairs. For example here we move the 'nose' of our koala image at '28,24' to the new position '45,40' (as indicated by the red arrow).

  convert koala.gif  -virtual-pixel white \
          -distort Affine '28,24 45,40'   koala_one_point.png
[IM Output] ==> [IM Output]

With two points, the 'Affine' distortion can not only translate an image but scale and rotate it as well. For example here I map the 'ears' of the koala (the red line from '30,11' to '48,29') to a horizontal position (a blue line from '15,15' to '60,15').

  convert koala.gif  -virtual-pixel white \
          -distort Affine '30,11 15,15  48,29 60,15'   koala_two_point.png
[IM Output] ==> [IM Output]

Of course an 'SRT' distortion could have reproduced the above two point 'Affine' distortion, except that here we defined the distortion using control points instead of scale, rotation, and translation arguments. Which form you should use is up to you, and what you are trying to achieve.
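To see how the two forms relate, you can work out the scale and rotation implied by two control point pairs. The Python sketch below does this by treating each coordinate as a complex number, so that the whole transform becomes z' = a*z + b. This is just one convenient way to do the arithmetic and is not taken from IM. The function name is hypothetical, and the sign of the angle assumes positive means clockwise on a Y-down image.

```python
import cmath
import math

def two_point_transform(src1, dst1, src2, dst2):
    """Return (a, b) such that a*src + b == dst for both control points."""
    s1, s2 = complex(*src1), complex(*src2)
    d1, d2 = complex(*dst1), complex(*dst2)
    a = (d2 - d1) / (s2 - s1)   # combined scale and rotation
    b = d1 - a * s1             # translation
    return a, b

# The koala 'ears' example: '30,11 15,15  48,29 60,15'
a, b = two_point_transform((30, 11), (15, 15), (48, 29), (60, 15))
scale = abs(a)
angle = math.degrees(cmath.phase(a))
second_ear = a * complex(48, 29) + b
```

For the koala ears this gives a scale of about 1.77 and an angle of -45 degrees, which are the kind of values you would hand to an 'SRT' distortion to get a similar result.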

Affine Distortion (a three point distort)

Both the 'SRT' distortion, and the one and two point forms of the 'Affine' distortion shown above are actually simplifications of a full 3 point form of the 'Affine' distortion. In fact if you study the "-verbose" output of any 'SRT' distortion (see verbose distort setting for an example) you will find that internally it really is an 'AffineProjection' distortion (see below).

The only distortion effect that the above methods could not handle fully was 'shears' similar to what the Shear Operator would provide. For that you need to use a three point affine distortion. You can think of a three point distortion by imagining the first coordinate mapping as an 'origin' with the other two coordinate mappings as vectors from that origin.

For example here I draw some text, and overlay a red and blue 'vector' to define the three control points relative to that text. Now by moving the coordinates of those two lines, we can translate, rotate, scale and shear that text image, to fit the new location of those lines.

  convert -background lightblue -fill Gray -font Candice \
      -size 100x100 -gravity center label:Affine\! \
      -draw 'fill blue stroke blue path "M 2,60 32,60 M 27,58 27,62 32,60 Z"' \
      -draw 'fill red  stroke red  path "M 2,60  2,30 M  0,35  4,35  2,30 Z"' \
      label_axis.png
  convert label_axis.png  -virtual-pixel background \
          -distort Affine '2,60 2,60     32,60 32,60    2,30 17,35' \
          label_axis_distort_shear.png
  convert label_axis.png  -virtual-pixel background \
          -distort Affine '2,60 2,60     32,60 27,85    2,30 27,35' \
          label_axis_distort_rotate.png
  convert label_axis.png  -virtual-pixel background \
          -distort Affine '2,60 30,50    32,60 60,80    2,30 30,5' \
          label_axis_distort_affine.png
[IM Output] ==> [IM Output] [IM Output] [IM Output]

In the first example only the third coordinate (for the vertical red line) was modified causing the image to be sheared. Later examples make more radical changes to the image, including scaling, and rotations.

Of course the Annotate Text operator can also skew actual text in this same way, though without any possible size changes (see Annotate Argument Usage). The Affine distortion however can do this for any image, and not just simple text.

Affine Least Squares Fit

If you supply more than 3 control points to an 'Affine' distortion, ImageMagick will perform a least squares fit over all those points to find the best representation for a 3 point 'Affine Projection'.

This means if you are trying to match up one image with another image ('Image Registration'), you can define more than 3 points so that the result will be a more precise distortion to match up the images.

Of course if one or more of those points do not 'fit' an 'Affine Projection', then the result will be only a rough fit that best represents all the control points given.
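Until a proper IM example is available, here is a rough Python sketch of what a least squares affine fit involves. It is not IM's implementation. It solves the standard 'normal equations' for the six coefficients, using the same assumed order as before (sx, rx, ry, sy, tx, ty, with x' = sx*x + ry*y + tx and y' = rx*x + sy*y + ty). All function names are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(m, rhs):
    """Solve a 3x3 linear system by Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("control points are collinear or too few")
    return [det3([[rhs[r] if c == col else m[r][c] for c in range(3)]
                  for r in range(3)]) / d
            for col in range(3)]

def fit_affine(pairs):
    """Least squares affine fit to ((u, v), (i, j)) control point pairs."""
    normal = [[0.0] * 3 for _ in range(3)]
    rhs_x, rhs_y = [0.0] * 3, [0.0] * 3
    for (u, v), (i, j) in pairs:
        row = (u, v, 1.0)
        for r in range(3):
            for c in range(3):
                normal[r][c] += row[r] * row[c]
            rhs_x[r] += row[r] * i
            rhs_y[r] += row[r] * j
    sx, ry, tx = solve3(normal, rhs_x)   # x' = sx*u + ry*v + tx
    rx, sy, ty = solve3(normal, rhs_y)   # y' = rx*u + sy*v + ty
    return sx, rx, ry, sy, tx, ty

# Four points that all agree with x' = 2x + 1, y' = 3y - 2
pairs = [((0, 0), (1, -2)), ((1, 0), (3, -2)), ((0, 1), (1, 1)), ((1, 1), (3, 1))]
fitted = fit_affine(pairs)
```

When every control point agrees with a single affine mapping, as above, the fit recovers it exactly. If one point were nudged out of place, the result would instead be the compromise that keeps the total squared error smallest.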

FUTURE: example needed

3d Cubes, Affine Layering

The 'Affine' distortion, with its control points, is ideal for generating Orthographic and Isometric Cubes (see Wikipedia, Orthographic Projection and Isometric Projection for definitions) from three images. All that you need to do is figure out four control points on a destination image.

As we will be using an Image Layering Technique the points can even have negative values, which allows IM to adjust the final image size according to the generated warped images.

For this example I'll choose the control points '0,0' for the center of the cube, and three points equally spaced around that central point, at '-87,-50', '87,-50', and '0,100'. All that I then need to do is map the appropriate corners of three (preferably square) images to these control points.

  convert -virtual-pixel transparent \
          \( lena_orig.png -matte \
             +distort Affine '0,511 0,0   0,0 -87,-50  511,511 87,-50' \) \
          \( mandrill_orig.png -matte \
             +distort Affine '511,0 0,0   0,0 -87,-50  511,511 0,100' \) \
          \( pagoda_sm.jpg -matte \
             +distort Affine '  0,0 0,0   0,319 0,100    319,0 87,-50' \) \
          -background black -layers merge +repage \
          -bordercolor black -border 10x5     isometric_cube.png
[IM Output]

Affine Projection Distortion

As I have already mentioned, the various arguments of an 'SRT' distortion and the control points of an 'Affine' distortion, are mathematically transformed into 6 special numbers which represent the 'coefficients' of an 'Affine Projection'.

The 6 floating point arguments are...
sx, rx, ry, sy, tx, ty

If you already have these coefficients pre-calculated (say extracted from the verbose output of distort, or calculated yourself using other methods from other forms of input arguments), then you can directly supply them to IM to distort the image.

For example, here I 'shear' the image but using an angle to calculate the coefficients, rather than the movement of control points.

   angle=-20
   sine=`convert xc: -format "%[fx:sin( $angle *pi/180)]" info:`
   convert koala.gif -matte -virtual-pixel Transparent \
           +distort AffineProjection "1,$sine,0,1,0,0" +repage \
           koala_affine_proj.png
[IM Output]

The older way of doing this was to use the "-affine" and "-transform" operational pair. However as of IM v6.4.2-8 this is just a simple call to 'AffineProjection' using the 'plus' or 'bestfit' form of the Distort Operator.
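Because the distortion itself works by Reverse Pixel Mapping, the six coefficients have to be turned around before they can be used to look up source pixels. The Python sketch below shows that inversion. It is my own illustration and assumes the six values describe the forward (source to destination) mapping, with x' = sx*x + ry*y + tx and y' = rx*x + sy*y + ty. The function names are hypothetical.

```python
def apply_affine(coeffs, x, y):
    """Map a point using six coefficients in sx, rx, ry, sy, tx, ty order."""
    sx, rx, ry, sy, tx, ty = coeffs
    return sx * x + ry * y + tx, rx * x + sy * y + ty

def invert_affine(coeffs):
    """Return the coefficients of the reverse (destination to source) mapping."""
    sx, rx, ry, sy, tx, ty = coeffs
    det = sx * sy - rx * ry
    if det == 0:
        raise ValueError("mapping is degenerate and cannot be reversed")
    inv_sx, inv_rx = sy / det, -rx / det
    inv_ry, inv_sy = -ry / det, sx / det
    # The reverse translation undoes the forward one in the inverted frame.
    inv_tx = -(inv_sx * tx + inv_ry * ty)
    inv_ty = -(inv_rx * tx + inv_sy * ty)
    return inv_sx, inv_rx, inv_ry, inv_sy, inv_tx, inv_ty

forward = (0.9, 0.3, -0.2, 0.7, 20.0, 15.0)
destination = apply_affine(forward, 10.0, 30.0)
source = apply_affine(invert_affine(forward), *destination)
```

Mapping a point forward and then through the inverted coefficients returns you to where you started, which is exactly the lookup needed for each destination pixel.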

Affine Tiling

All three of the above affine-like distortion methods we have looked at so far also provide interesting ways to generate various tiling patterns, based on a distorted image.


  convert checks.png    -matte    -virtual-pixel tile \
          -distort  ScaleRotateTranslate  '20,20  .5  30' \
          checks_srt_tile.png
  convert checks.png    -matte    -virtual-pixel tile \
          -distort  Affine  '0,0 10,10   0,89 10,50   89,0 50,0' \
          checks_affine_tile.png
  convert checks.png    -matte    -virtual-pixel tile \
          -distort  AffineProjection  '0.9,0.3,-0.2,0.7,20,15' \
          checks_amatrix_tile.png
[IM Output] ==> [IM Output] [IM Output] [IM Output]

Using a distortion mapping in this way is actually how 'texture mapping' works in 3D graphic libraries and games. The only difference is that they map 3 dimensional coordinates of surfaces, back to a two dimensional image.

Even the 'no-op' distortion ("-distort SRT 0"), with an appropriate Distort Viewport, provides a useful way of tiling a whole sequence of images such as Animated Glitter Tiles.

  convert glitter_blue.gif -virtual-pixel tile \
          -set option:distort:viewport 100x100 -distort SRT 0 \
          glitter_blue_tiled.gif
[IM Output] ==> [IM Output]


Perspective Distortion (a four point distort)

Probably the most commonly requested type of distortion is a fast perspective distortion operation. This is a 4 point distortion, so it requires at least 4 sets of control point pairs, or 16 floating point values.

For example, here I have an image of a building. From this image I manually discovered the location of 4 points (green). I also defined the final location to which I wanted those points transformed in the final image (blue), so as to 'straighten' or 'rectify' the face of the building.

  convert building.jpg \
          -draw 'fill none stroke green polygon 7,40 4,124, 85,122, 85,2' \
          building_before.jpg
  convert building.jpg \
          -draw 'fill none stroke blue polygon 4,30 4,123, 100,123, 100,30' \
          building_after.jpg
[IM Output] ==> [IM Output] [IM Output]

To do the actual image distortion, you only need to feed those coordinates into the 'perspective' method of "-distort".

  convert building.jpg -matte -virtual-pixel transparent \
         -distort Perspective \
              '7,40 4,30   4,124 4,123   85,122 100,123   85,2 100,30' \
          building_pers.png
[IM Output] ==> [IM Output]

Notice the blank area on the top right, where the distortion 'missed' the pixel data in the source image. What IM does in this situation is controlled by the "-virtual-pixel" setting (see Virtual Pixel).

What is less noticeable is that a small amount of the left-most edge of the original image is also 'lost' for the same reason.

As a matter of interest let's also reverse the distortion, by swapping the coordinates of each mapping pair. This lets us see just how much of the image is degraded by the distortion.

  convert building_pers.png  -matte -virtual-pixel transparent \
         -distort Perspective \
              '4,30 7,40   4,123 4,124   100,123 85,122   100,30 85,2' \
          building_pers_rev.png
[IM Output] ==> [IM Output] ==> [IM Output]

Not bad. A lot of 'fuzziness' is present, but that can't be helped. Notice that the 'fuzziness' is worse on the right side of the image where it was compressed the most. All distorts suffer from this compression problem, so you should always try to distort from an original image, rather than distorting an already distorted image.
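
Swapping each coordinate pair by hand is error prone when there are many control points. As a small sketch, the hypothetical shell function below (plain tr and awk, not an IM feature) takes a control point string and prints it with the source and destination of each pair exchanged. It assumes the string holds a whole number of 'x,y x,y' pairs on a single line.

```shell
# Hypothetical helper: swap source and destination in each control point pair.
# usage: swap_pairs 'sx,sy dx,dy   sx,sy dx,dy ...'
swap_pairs() {
  printf '%s\n' "$1" | tr ',' ' ' | awk '{
    out = ""
    for (i = 1; i + 3 <= NF; i += 4) {      # four numbers per pair
      sep = (i > 1) ? "  " : ""
      out = out sep $(i+2) "," $(i+3) " " $i "," $(i+1)
    }
    print out
  }'
}

swap_pairs '7,40 4,30   4,124 4,123   85,122 100,123   85,2 100,30'
```

For the building example this prints the same coordinates as the reversed "-distort Perspective" argument used above, differing only in spacing.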

Here is another example of this transform, using the special checkerboard test image we created above. We distort it, then reverse the distortion.

  convert checks.png        -matte    -virtual-pixel transparent \
          -distort Perspective '0,0,0,0  0,90,0,90  90,0,90,25  90,90,90,65' \
          checks_pers.png
  convert checks_pers.png   -matte    -virtual-pixel transparent \
          -distort Perspective '0,0,0,0  0,90,0,90  90,25,90,0  90,65,90,90' \
          checks_pers_rev.png
[IM Output] ==> [IM Output] ==> [IM Output]

You can see the slight fuzziness caused by the distortion, but the image is basically restored.

What actually happens is that IM uses all the control point pairs given to calculate the appropriate coefficients for a 'Perspective Projection' (see next). If you include a Verbose setting, you can see both the coefficients, and the DIY FX Equivalent that is being used internally by IM to perform this distortion.

If only 3 or fewer control point pairs are provided, IM will automatically fall back to the simpler 'Affine' distortion. If more than 4 point pairs are given (for 'Image Registration'), they will be least squares fitted to find the best fitting distortion for all the given control points.

FUTURE: Alternative. The four coordinates could also represent a triangle and center point. You can fix the triangle and move the center point, or fix that center and move the other three coordinates, to generate the perspective view.

If you would like to see more detail on what IM actually does and the mathematics involved, see DIY Perspective Distortion. You can also see a Postscript implementation that was presented in a PDF paper Perspective Rectification, by Gernot Hoffmann. Also have a look at Leptonica Affine and Perspective Transforms.


Viewing Distant Horizons

You can produce some very unusual effects using Perspective Distortions if you adjust the coordinates to produce a 'vanishing point' within the boundaries of the image.


  convert checks.png -mattecolor DodgerBlue \
          -virtual-pixel background -background Green \
          -distort Perspective '0,0 20,60  90,0 70,63  0,90 5,83  90,90 85,88' \
          checks_horizon.png
[IM Output]

Well we used 'green' for the surrounding virtual pixels of this image, which we enabled using Virtual Pixel Background Settings. But what is more interesting is the appearance of the 'blue' color that was defined using the "-mattecolor" setting.

This 'blue' color represents an area where the pixels generated by the distortion are invalid, and in such areas the "-distort" operator will just output the "-mattecolor" setting.

For a Perspective Distortion, any pixel ending up in the 'sky' of the resulting image will be classed as invalid. The 'sky' is defined as the side of the 'horizon' on which the source image will not appear. It will only appear when images are highly foreshortened by the distortion.

If you don't want a 'sky' in your final image result then the best idea is to set both "-background" and "-mattecolor" to use the same color.

The Perspective Distortion gets more interesting when one of the special infinite tiling Virtual Pixel settings is used. For example here we used a 'tile' setting to generate an infinitely tiled plane.

  convert checks.png  -virtual-pixel tile -mattecolor DodgerBlue \
          -distort Perspective '0,0 20,60  90,0 70,63  0,90 5,83  90,90 85,88' \
          horizon_tile.png
[IM Output]

A word of warning about this image. Asking for an infinitely tiled image is very slow to generate. The larger the image the slower it gets. You can monitor the progress of the "-distort" (or any other slow image processing task) using the "-monitor" Operational Control Setting.

Basically for a single pixel that is close to the horizon, ImageMagick will need to average a huge number of pixels from the original image to figure out the appropriate color. This can take a very long time. ImageMagick does try to limit the amount of time it uses to handle these near-horizon pixels, but it can still take a long time.

For more details of this method see Area Resampling above.

Another infinitely tiled perspective image can be generated by using a Random Virtual Pixel Setting...

  convert checks.png  -virtual-pixel random -mattecolor DodgerBlue \
          -distort Perspective '0,0 20,60  90,0 70,63  0,90 5,83  90,90 85,88' \
          horizon_random.png
[IM Output]

What is happening is that all virtual pixels surrounding the image are just random picks of any pixel within the image itself. The result is a ground consisting of random noise that gets smoother and more blurred as you look toward the horizon of the image. It gives a natural feeling of depth, without any specific repeating pattern.

Here I repeated the above but with a pure black and white source image. However I am not interested in the actual distorted image, only the Virtual Pixel 'random' pattern that was generated, so I changed what part of the 'distorted image space' I am looking at, by using a special '-set option:distort:viewport' setting. This setting overrides the normal size and location of the area of distorted space being viewed.


  convert -size 90x90 pattern:gray50 -matte \
       -virtual-pixel random -mattecolor none \
       -set option:distort:viewport 120x120+100-15 \
       -distort Perspective '0,0 20,60  90,0 70,63  0,90 5,83  90,90 85,88' \
       +repage -size 120x50 gradient:dodgerblue-tomato \