conversion to pdf using aws lambda / node.js does not work

Roberto
Posts: 6
Joined: 2019-07-04T07:03:14-07:00
Authentication code: 1152

conversion to pdf using aws lambda / node.js does not work

Post by Roberto »

I am new to ImageMagick and am trying to convert ".png" and ".jpg" image files to PDF using an AWS Lambda / Node.js function, but the resulting file does not have a PDF header.
Can anyone tell me whether there is any restriction on this?
I read a file from AWS S3, save it in /tmp, and then execute the command:

im.convert([('/tmp/input-'+ srcKey), ('/tmp/' + dstKey)], function(err, res) {

where srcKey is the original filename (".jpg") and dstKey is the output filename with the ".pdf" extension.

The output file is generated, but without a PDF header. For comparison, below is the header of the file generated when I run the command from the command line:

%PDF-1.3
1 0 obj
<<
/Pages 2 0 R
/Type /Catalog
>>
endobj
2 0 obj
<<
/Type /Pages
/Kids [ 3 0 R ]
/Count 1
>>
endobj
3 0 obj
<<
...
If I change the file extension to ".jpg" or ".png" I can view the generated file, but not with the ".pdf" extension. It looks as if ImageMagick is not doing the conversion.
Any ideas?
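
One way to confirm what was actually written, independent of the file extension, is to look at the first bytes of the output file. The following is only a minimal sketch: `sniffFormat` is a hypothetical helper name (not part of ImageMagick or of the Node.js wrapper), and it recognises just the three formats mentioned in this thread.

```javascript
'use strict';

// Hypothetical helper (an assumption for illustration): guess a file's real
// format from its leading "magic" bytes instead of trusting the extension.
function sniffFormat(buf) {
    // PDF files start with the ASCII text "%PDF-"
    if (buf.length >= 5 && buf.toString('latin1', 0, 5) === '%PDF-') {
        return 'pdf';
    }
    // JPEG files start with the bytes FF D8 FF
    if (buf.length >= 3 && buf[0] === 0xff && buf[1] === 0xd8 && buf[2] === 0xff) {
        return 'jpeg';
    }
    // PNG files start with the 8-byte signature 89 50 4E 47 0D 0A 1A 0A
    const pngSignature = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
    if (buf.length >= 8 && buf.subarray(0, 8).equals(pngSignature)) {
        return 'png';
    }
    return 'unknown';
}
```

Logging the result for the buffer read back from /tmp would show whether the ".pdf" file is really a PDF or still a JPEG/PNG under a different name.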

Regards
Roberto
Posts: 6
Joined: 2019-07-04T07:03:14-07:00
Authentication code: 1152

Re: conversion to pdf using aws lambda / node.js does not work

Post by Roberto »

Hi guys.
Has anyone ever converted a ".jpg" or ".png" image file to PDF using ImageMagick with AWS Lambda?
From the command line on my computer the conversion works normally, but via AWS Lambda it does not.

Basically, I read an image file from AWS S3 and save it in /tmp. Then I execute the IM convert command.
After the conversion, I read the converted file into a buffer and save it back to AWS S3:


//Get Image File from S3
console.log('Start GetObject.');
s3.getObject({
    Bucket: srcBucket,
    Key: srcKey
}).promise()
.then(function (response) {

    // put image file to buffer
    let buffIn = new Buffer.from(response.Body, 'base64');

    // write image file to /tmp
    fs.writeFile(('/tmp/input-'+ srcKey), buffIn, function(err) {
        // If an error occurred, show it and return
        if(err) {
            console.log('error:%s',err);
        } else {
            //Loop through the sizes and perform resizing for each available options
            _sizesArray.forEach(function (value, key) {

                let dstKey = _sizesArray[key].outFilename;

                // transform, and upload to S3 with different FileNames
                async.waterfall([
                    .
                    .
                    .
                    .
                    im.convert([('/tmp/input-' + srcKey), ('/tmp/' + dstKey)], function(err, res) {

                        .
                        .
                        .
                        fs.readFile(('/tmp/' + dstKey), function(err, buffOut) {
                            if(err) {
                                console.log('error:%s',err);
                                next(err);
                            } else {
                                next(null, response.ContentType, buffOut);
                            }


Where:
'/tmp/input-' + srcKey = source image file name with the ".jpg" extension
'/tmp/' + dstKey = destination file name with the ".pdf" extension

No error is generated, and the destination file is identical to the source file, just with the extension changed to ".pdf".
I installed Ghostscript and included it in the package sent to Lambda, but it did not change anything.
Does anyone have any idea what the problem might be?

Regards
fmw42
Posts: 25562
Joined: 2007-07-02T17:14:51-07:00
Authentication code: 1152
Location: Sunnyvale, California, USA

Re: conversion to pdf using aws lambda / node.js does not work

Post by fmw42 »

ImageMagick is a raster processor. So note that if you convert a raster image to PDF, it will be a raster image in a vector PDF shell. It will not be converted to vector. For that you need a tool such as potrace.

Note that you might have to modify your policy.xml file to permit reading and writing PDF files.

Sorry, I do not know AWS.
snibgo
Posts: 12159
Joined: 2010-01-23T23:01:33-07:00
Authentication code: 1151
Location: England, UK

Re: conversion to pdf using aws lambda / node.js does not work

Post by snibgo »

Note also that writing to PDF doesn't use Ghostscript. GS is needed only for reading PDF and other formats.


I don't know AWS. I suggest you check the values of srcKey and dstKey.
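
As a rough illustration of checking srcKey and dstKey, the sketch below logs both paths, refuses a destination that does not end in ".pdf", and prefixes the output with "pdf:" (ImageMagick's `format:filename` syntax) so the output coder is stated explicitly. `buildPdfConvertArgs` is a hypothetical helper; the `im.convert(args, callback)` call shape it is meant for is taken from the code posted earlier in this thread, and the /tmp paths mirror that code.

```javascript
'use strict';
const path = require('path');

// Hypothetical helper (an assumption for illustration): build the argument
// array for im.convert() so the output is explicitly written as PDF.
function buildPdfConvertArgs(srcKey, dstKey) {
    // Same path construction as in the posted Lambda code
    const src = '/tmp/input-' + srcKey;
    const dst = '/tmp/' + dstKey;

    // Fail early if the destination key does not end in ".pdf"
    if (path.posix.extname(dst).toLowerCase() !== '.pdf') {
        throw new Error('dstKey does not end in .pdf: ' + dstKey);
    }

    // Log the exact values that will be handed to ImageMagick
    console.log('convert source=%s destination=%s', src, dst);

    // "pdf:" selects the PDF coder regardless of the file extension
    return [src, 'pdf:' + dst];
}
```

The returned array could then be passed as the first argument of `im.convert` in place of the inline array.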
snibgo's IM pages: im.snibgo.com
Roberto
Posts: 6
Joined: 2019-07-04T07:03:14-07:00
Authentication code: 1152

Re: conversion to pdf using aws lambda / node.js does not work

Post by Roberto »

Thanks snibgo and fmw42.
I just find it strange that I can resize images and apply profile files, but conversion to PDF does not work.
I'll do some more testing, thank you.
fmw42
Posts: 25562
Joined: 2007-07-02T17:14:51-07:00
Authentication code: 1152
Location: Sunnyvale, California, USA

Re: conversion to pdf using aws lambda / node.js does not work

Post by fmw42 »

As I mentioned above, check and modify your policy.xml file for permission to use PDF/EPS/PS files.
Roberto
Posts: 6
Joined: 2019-07-04T07:03:14-07:00
Authentication code: 1152

Re: conversion to pdf using aws lambda / node.js does not work

Post by Roberto »

Thanks fmw42.

In the policy.xml file I have the following lines:
</policymap>
<policy domain="module" rights="read|write" pattern="{EPS,PS2,PS3,PS,PDF,XPS}" />
<policy domain="module" rights="read|write" pattern="{GIF,JPEG,PNG,WEBP,JPG`}" />
<policy domain="coder" rights="read|write" pattern="{GIF,JPEG,PNG,WEBP,JPG}" />
<policy domain="coder" rights="read|write" pattern="{EPS,PS2,PS3,PS,PDF,XPS}" />


and using the command identify -list policy we have:

$ identify -list policy

Path: /usr/local/Cellar/imagemagick/7.0.8-53/etc/ImageMagick-7/policy.xml
Policy: Resource
name: list-length
value: 128
Policy: Resource
name: time
value: 120
Policy: Resource
name: throttle
value: 0
Policy: Resource
name: thread
value: 2
Policy: Resource
name: file
value: 768
Policy: Resource
name: disk
value: 1GiB
Policy: Resource
name: map
value: 512MiB
Policy: Resource
name: memory
value: 256MiB
Policy: Resource
name: area
value: 16KP
Policy: Resource
name: height
value: 8KP
Policy: Resource
name: width
value: 8KP
Policy: Resource
name: temporary-path
value: /tmp
Policy: System
name: precision
value: 6
Policy: Filter
rights: None
pattern: *
Policy: Path
rights: None
pattern: @*
Policy: Module
rights: Read Write
pattern: {EPS,PS2,PS3,PS,PDF,XPS}
Policy: Module
rights: Read Write
pattern: {GIF,JPEG,PNG,WEBP,JPG`}
Policy: Coder
rights: Read Write
pattern: {GIF,JPEG,PNG,WEBP,JPG}
Policy: Coder
rights: Read Write
pattern: {EPS,PS2,PS3,PS,PDF,XPS}

Path: [built-in]
Policy: Undefined
rights: None


The delegate list shows the following, but I don't know whether it is enough:

$ identify -list delegate | grep pdf
doc => "soffice' --convert-to pdf -outdir `dirname '%i'` '%i' 2> '%u'; /bin/mv '%i.pdf' '%o"
docx => "soffice' --convert-to pdf -outdir `dirname '%i'` '%i' 2> '%u'; /bin/mv '%i.pdf' '%o"
eps<=>pdf "gs' -sstdout=%%stderr -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 '-sDEVICE=pdfwrite' '-sOutputFile=%o' '-f%i"
odt => "soffice' --convert-to pdf -outdir `dirname '%i'` '%i' 2> '%u'; /bin/mv '%i.pdf' '%o"
pdf<=>ps "gs' -sstdout=%%stderr -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 '-sDEVICE=ps2write' -sPDFPassword='%a' '-sOutputFile=%o' '-f%i"
pdf<=>eps "gs' -sstdout=%%stderr -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 -sPDFPassword='%a' '-sDEVICE=eps2write' '-sOutputFile=%o' '-f%i"
ppt => "soffice' --convert-to pdf -outdir `dirname '%i'` '%i' 2> '%u'; /bin/mv '%i.pdf' '%o"
pptx => "soffice' --convert-to pdf -outdir `dirname '%i'` '%i' 2> '%u'; /bin/mv '%i.pdf' '%o"
ps<=>pdf "gs' -sstdout=%%stderr -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 '-sDEVICE=pdfwrite' '-sOutputFile=%o' '-f%i"
xls => "soffice' --convert-to pdf -outdir `dirname '%i'` '%i' 2> '%u'; /bin/mv '%i.pdf' '%o"
xlsx => "soffice' --convert-to pdf -outdir `dirname '%i'` '%i' 2> '%u'; /bin/mv '%i.pdf' '%o"
$

With this configuration I can convert image files to PDF on my computer, but not in AWS Lambda.
Looking at the package created for the AWS Lambda function, I don't see this file, so that may be the problem.
I'm going to do some testing to see the result of the "identify -list policy" command in AWS Lambda.
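
A minimal sketch of how such a check could be run from inside the Lambda function, assuming the `identify` binary is reachable by the Node.js process. `listImageMagickConfig` is a hypothetical helper name; only Node's built-in `child_process.spawnSync` is used.

```javascript
'use strict';
const { spawnSync } = require('child_process');

// Hypothetical diagnostic helper (an assumption for illustration): run
// "identify -list <what>" and return its output so it can be logged.
// The binary defaults to "identify"; pass a full path if it is not on PATH.
function listImageMagickConfig(what, binary = 'identify') {
    const result = spawnSync(binary, ['-list', what], { encoding: 'utf8' });
    if (result.error) {
        // For example ENOENT when the binary cannot be found
        return { ok: false, output: String(result.error.message) };
    }
    return {
        ok: result.status === 0,
        output: result.stdout + result.stderr,
    };
}
```

Calling it with 'policy' and with 'delegate' and logging the `output` field would reproduce, for the Lambda environment, the two listings shown above.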

Thanks a lot.
Roberto
Posts: 6
Joined: 2019-07-04T07:03:14-07:00
Authentication code: 1152

Re: conversion to pdf using aws lambda / node.js does not work

Post by Roberto »

Hi guys,
For information, in the AWS Lambda instance I get the following:

For the "identify -list policy" command:

/var/task

Path: [built-in]
Policy: Undefined
rights: None
Path: /etc/ImageMagick/policy.xml
Policy: Coder
rights: None
pattern: EPHEMERAL
Policy: Coder
rights: None
pattern: HTTPS
Policy: Coder
rights: None
pattern: HTTP
Policy: Coder
rights: None
pattern: URL
Policy: Coder
rights: None
pattern: FTP
Policy: Coder
rights: None
pattern: MVG
Policy: Coder
rights: None
pattern: MSL
Policy: Coder
rights: None
pattern: TEXT
Policy: Coder
rights: None
pattern: LABEL
Policy: Path
rights: None
pattern: @*

For the "identify -list delegate | grep pdf" command:

/var/task
eps<=>pdf "gs" -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 "-sDEVICE=pdfwrite" "-sOutputFile=%o" "-f%i"
pdf<=>eps "gs" -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 "-sDEVICE=epswrite" "-sOutputFile=%o" "-f%i"
pdf<=>ps "gs" -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 "-sDEVICE=pswrite" "-sOutputFile=%o" "-f%i"
ps<=>pdf "gs" -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 "-sDEVICE=pdfwrite" "-sOutputFile=%o" "-f%i"

With these settings, should I be able to convert an image file to PDF?
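
To make a listing like the one above easier to read, the policy text can be reduced to the coder patterns that are denied. This is only a rough sketch: `deniedCoderPatterns` is a hypothetical helper, it assumes the "Policy: / rights: / pattern:" layout shown in this thread, and it merely lists the patterns as printed (ImageMagick treats them as glob patterns, which this code does not evaluate).

```javascript
'use strict';

// Hypothetical helper (an assumption for illustration): parse the text printed
// by "identify -list policy" and return the pattern of every Coder policy
// whose rights are "None".
function deniedCoderPatterns(policyText) {
    const denied = [];
    let current = null;
    // Accept real newlines as well as literal "\n" sequences from escaped logs
    const lines = policyText.split(/\\n|\r?\n/);
    for (const raw of lines) {
        const match = /^(Policy|rights|pattern):\s*(.*)$/.exec(raw.trim());
        if (!match) continue;
        const key = match[1];
        const value = match[2];
        if (key === 'Policy') {
            // A "Policy:" line starts a new entry
            current = { domain: value, rights: '', pattern: '' };
        } else if (current && key === 'rights') {
            current.rights = value;
        } else if (current && key === 'pattern') {
            current.pattern = value;
            if (current.domain === 'Coder' && current.rights === 'None') {
                denied.push(current.pattern);
            }
        }
    }
    return denied;
}
```

Feeding it the `identify -list policy` text from the Lambda instance shows at a glance whether PDF appears among the denied coders.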

Thanks
fmw42
Posts: 25562
Joined: 2007-07-02T17:14:51-07:00
Authentication code: 1152
Location: Sunnyvale, California, USA

Re: conversion to pdf using aws lambda / node.js does not work

Post by fmw42 »

So is this resolved? If so, what was the solution?
Roberto
Posts: 6
Joined: 2019-07-04T07:03:14-07:00
Authentication code: 1152

Re: conversion to pdf using aws lambda / node.js does not work

Post by Roberto »

Hi fmw42.
No, from what I saw, in the instance where the function runs, the policy.xml file is not enabled to work with PDF files.
I am looking to see if it is possible to somehow change the aws lambda instance policy.xml configuration.

Regards
fmw42
Posts: 25562
Joined: 2007-07-02T17:14:51-07:00
Authentication code: 1152
Location: Sunnyvale, California, USA

Re: conversion to pdf using aws lambda / node.js does not work

Post by fmw42 »

You may also have to edit the delegates.xml file to put the full path to Ghostscript in all entries where it says:

command="&quot;gs&quot;

Insert your path to ghostscript

command="&quot;path/gs&quot;

That is a common issue with PHP Imagick, since it does not use the $PATH environment variable for the shell.


Please report back, especially if you resolve it, since this is a common issue I see reported on stackoverflow also.
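
The delegates.xml edit described above could be scripted roughly as follows. This is a sketch under stated assumptions: `useFullGhostscriptPath` is a hypothetical helper, the Ghostscript path is whatever applies in your own package (the value in the usage note is only a placeholder), and, as noted earlier in the thread, Ghostscript is needed only for reading PDF, so this may not affect writing an image to PDF.

```javascript
'use strict';

// Hypothetical helper (an assumption for illustration): replace the bare "gs"
// command in the text of a delegates.xml file with a full path to Ghostscript.
function useFullGhostscriptPath(delegatesXml, gsPath) {
    // A replacement function is used so that any "$" in gsPath is taken
    // literally rather than as a replacement pattern.
    return delegatesXml.replace(
        /command="&quot;gs&quot;/g,
        () => 'command="&quot;' + gsPath + '&quot;'
    );
}
```

For example, the file contents could be read with `fs.readFileSync`, passed through `useFullGhostscriptPath(xml, '/path/to/gs')`, and written back before the package is built.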
chaddjohnson
Posts: 14
Joined: 2018-11-27T20:59:27-07:00
Authentication code: 1152

Re: conversion to pdf using aws lambda / node.js does not work

Post by chaddjohnson »

Hi, not sure if this will help, but here is how I got PDF support working in Lambda: https://gist.github.com/bensie/56f51bc3 ... nt-2983885