
conversion to pdf using aws lambda / node.js does not work

Posted: 2019-07-04T07:55:47-07:00
by Roberto
I am new to ImageMagick and am trying to convert ".png" and ".jpg" image files to PDF using an AWS Lambda / Node.js function, but the resulting file does not have a PDF header.
Can anyone tell me if there is any restriction for this?
I am reading a file from aws S3, saving in /tmp and then executing the command:

im.convert([('/tmp/input-'+ srcKey), ('/tmp/' + dstKey)], function(err, res) {

where srcKey is the original filename (".jpg") and dstKey is the output filename with the ".pdf" extension.

The output file is generated, but without a PDF header. For comparison, this is the start of the file generated when I run the same command from the command line:


%PDF-1.3
1 0 obj
<<
/Pages 2 0 R
/Type /Catalog
>>
endobj
2 0 obj
<<
/Type /Pages
/Kids [ 3 0 R ]
/Count 1
>>
endobj
3 0 obj
<<
...
If I change the file extension to ".jpg" or ".png" I can view the generated file, but not with the ".pdf" extension. It looks as if ImageMagick is not doing the conversion.
Any idea?

Regards
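
One way to confirm the symptom described above from inside the Lambda function is to look at the first bytes of the output file. The following is only a minimal sketch: hasPdfHeader and fileLooksLikePdf are hypothetical helper names, not part of the imagemagick module, and the check relies only on the "%PDF-" signature shown in the header dump above.

```javascript
const fs = require('fs');

// Returns true when the buffer starts with the PDF signature "%PDF-".
function hasPdfHeader(buffer) {
  return buffer.length >= 5 && buffer.toString('latin1', 0, 5) === '%PDF-';
}

// Hypothetical helper: reads only the first 5 bytes of a file and checks them.
function fileLooksLikePdf(filePath) {
  const fd = fs.openSync(filePath, 'r');
  try {
    const head = Buffer.alloc(5);
    const bytesRead = fs.readSync(fd, head, 0, 5, 0);
    return hasPdfHeader(head.slice(0, bytesRead));
  } finally {
    fs.closeSync(fd);
  }
}
```

Calling fileLooksLikePdf('/tmp/' + dstKey) right after the convert callback, and logging the result, would show whether the file is a real PDF before it is uploaded back to S3.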

Re: conversion to pdf using aws lambda / node.js does not work

Posted: 2019-07-15T11:21:31-07:00
by Roberto
Hi Guys.
Has anyone ever converted a ".jpg" or ".png" image file to PDF using imagemagick with AWS lambda?
Using the command line on my computer I can do the conversion normally, but via AWS Lambda it does not work.

Basically I'm reading an image file from AWS S3 and saving it in /tmp. Then I execute the IM convert command.
After the conversion I read the converted file into a buffer and save it back to AWS S3:


//Get Image File from S3
console.log('Start GetObject.');
s3.getObject({
    Bucket: srcBucket,
    Key: srcKey
}).promise()
.then(function (response) {

    // put image file to buffer
    let buffIn = new Buffer.from(response.Body, 'base64');

    // write image file to /tmp
    fs.writeFile(('/tmp/input-'+ srcKey), buffIn, function(err) {
        // If an error occurred, show it and return
        if(err) {
            console.log('error:%s',err);
        } else {
            //Loop through the sizes and perform resizing for each available options
            _sizesArray.forEach(function (value, key) {

                let dstKey = _sizesArray[key].outFilename;

                // transform, and upload to S3 with different FileNames
                async.waterfall([
                    .
                    .
                    .
                    .
                    im.convert([('/tmp/input-' + srcKey), ('/tmp/' + dstKey)], function(err, res) {

                        .
                        .
                        .
                        fs.readFile(('/tmp/' + dstKey), function(err, buffOut) {
                            if(err) {
                                console.log('error:%s',err);
                                next(err);
                            } else {
                                next(null, response.ContentType, buffOut);
                            }


Where:
'/tmp/input-' + srcKey = Source Image File Name with ".jpg" extension
'/tmp/' + dstKey = Destination Image File Name with ".pdf" extension

No error is generated, and the generated destination file is identical to the source file, just with the extension changed to ".pdf".
I installed Ghostscript and put it in the package sent to Lambda, but it did not change anything.
Does anyone have any idea what the problem might be?

Regards

Re: conversion to pdf using aws lambda / node.js does not work

Posted: 2019-07-15T12:37:33-07:00
by fmw42
ImageMagick is a raster processor. So note that if you convert a raster image to PDF, it will be a raster image in a vector PDF shell. It will not be converted to vector. For that you need a tool such as potrace.

Note that you might have to modify your policy.xml file to permit reading and writing PDF files.

Sorry, I do not know AWS.

Re: conversion to pdf using aws lambda / node.js does not work

Posted: 2019-07-15T13:08:48-07:00
by snibgo
Note also that writing to PDF doesn't use Ghostscript. GS is needed only for reading PDF and other formats.


I don't know AWS. I suggest you check the values of srcKey and dstKey.
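
A minimal sketch of such a check, assuming the same '/tmp/input-' + srcKey and '/tmp/' + dstKey layout used earlier in the thread. buildConvertArgs is a hypothetical helper name. Two things in it are assumptions rather than anything confirmed here: that S3 keys may contain "/" and so are reduced to their file name, and that prefixing the output with "pdf:" (ImageMagick's explicit format prefix) is wanted so the output format does not depend on the extension alone.

```javascript
const path = require('path');

// Hypothetical helper: builds the argument array passed to im.convert.
function buildConvertArgs(srcKey, dstKey, tmpDir = '/tmp') {
  // S3 keys can look like "photos/cat.jpg"; keep only the file name part
  const srcName = path.posix.basename(srcKey);
  const dstName = path.posix.basename(dstKey);

  // Fail loudly if the destination is not what we think it is
  if (path.posix.extname(dstName).toLowerCase() !== '.pdf') {
    throw new Error('Expected a .pdf destination, got: ' + dstKey);
  }

  const src = path.posix.join(tmpDir, 'input-' + srcName);
  const dst = path.posix.join(tmpDir, dstName);

  // "pdf:" is ImageMagick's explicit output format prefix
  return [src, 'pdf:' + dst];
}
```

Logging the returned array just before the convert call shows exactly which paths ImageMagick receives. If the key is reduced to its file name like this, the earlier fs.writeFile call has to use the same source path.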

Re: conversion to pdf using aws lambda / node.js does not work

Posted: 2019-07-16T06:26:12-07:00
by Roberto
Thanks snibgo and fmw42.
I just find it strange that I can do size conversions and apply profile files, but conversion to PDF does not work.
I'll do some more testing, thank you.

Re: conversion to pdf using aws lambda / node.js does not work

Posted: 2019-07-16T09:05:01-07:00
by fmw42
As I mentioned above, check and modify your policy.xml file for permission to use PDF/EPS/PS files.
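
To make that check repeatable, the text printed by "identify -list policy" can be scanned for coder rules that name a given format. This is only a rough sketch: parsePolicyList and coderRights are hypothetical names, and the pattern handling is a simplification (plain names, "{A,B}" lists and "*"), not ImageMagick's real glob matching.

```javascript
// Parses the text printed by `identify -list policy` into a list of entries.
function parsePolicyList(text) {
  const entries = [];
  let current = null;
  for (const rawLine of text.split(/\r?\n/)) {
    // Tolerate a literal "\n" at the end of a line, as can happen in logged output
    const line = rawLine.replace(/\\n\s*$/, '').trim();
    const match = /^(\w[\w-]*):\s*(.*)$/.exec(line);
    if (!match) continue;
    const key = match[1].toLowerCase();
    const value = match[2].trim();
    if (key === 'policy') {
      current = { domain: value };
      entries.push(current);
    } else if (key === 'path') {
      current = null; // a new policy file section starts
    } else if (current) {
      current[key] = value; // rights, pattern, name, value
    }
  }
  return entries;
}

// Simplified pattern handling: "PDF", "{EPS,PS,PDF}" or "*".
function patternNames(pattern) {
  return pattern.replace(/^\{|\}$/g, '').split(',').map((s) => s.trim().toUpperCase());
}

// Returns the rights of every coder rule whose pattern names the format.
function coderRights(entries, format) {
  const wanted = format.toUpperCase();
  return entries
    .filter((e) => e.domain.toLowerCase() === 'coder' && e.pattern)
    .filter((e) => {
      const names = patternNames(e.pattern);
      return names.includes('*') || names.includes(wanted);
    })
    .map((e) => e.rights);
}
```

An empty array from coderRights(parsePolicyList(output), 'PDF') means no listed coder rule names PDF at all, while a value such as 'None' points at a rule that denies it.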

Re: conversion to pdf using aws lambda / node.js does not work

Posted: 2019-07-16T10:23:53-07:00
by Roberto
Thanks fmw42.

In the policy.xml file I have the following lines:
</policymap>
<policy domain="module" rights="read|write" pattern="{EPS,PS2,PS3,PS,PDF,XPS}" />
<policy domain="module" rights="read|write" pattern="{GIF,JPEG,PNG,WEBP,JPG`}" />
<policy domain="coder" rights="read|write" pattern="{GIF,JPEG,PNG,WEBP,JPG}" />
<policy domain="coder" rights="read|write" pattern="{EPS,PS2,PS3,PS,PDF,XPS}" />


and using the command identify -list policy we have:

$ identify -list policy

Path: /usr/local/Cellar/imagemagick/7.0.8-53/etc/ImageMagick-7/policy.xml
Policy: Resource
name: list-length
value: 128
Policy: Resource
name: time
value: 120
Policy: Resource
name: throttle
value: 0
Policy: Resource
name: thread
value: 2
Policy: Resource
name: file
value: 768
Policy: Resource
name: disk
value: 1GiB
Policy: Resource
name: map
value: 512MiB
Policy: Resource
name: memory
value: 256MiB
Policy: Resource
name: area
value: 16KP
Policy: Resource
name: height
value: 8KP
Policy: Resource
name: width
value: 8KP
Policy: Resource
name: temporary-path
value: /tmp
Policy: System
name: precision
value: 6
Policy: Filter
rights: None
pattern: *
Policy: Path
rights: None
pattern: @*
Policy: Module
rights: Read Write
pattern: {EPS,PS2,PS3,PS,PDF,XPS}
Policy: Module
rights: Read Write
pattern: {GIF,JPEG,PNG,WEBP,JPG`}
Policy: Coder
rights: Read Write
pattern: {GIF,JPEG,PNG,WEBP,JPG}
Policy: Coder
rights: Read Write
pattern: {EPS,PS2,PS3,PS,PDF,XPS}

Path: [built-in]
Policy: Undefined
rights: None


I also ran the command below to list the PDF delegates, but I don't know whether this is enough:

$ identify -list delegate | grep pdf
doc => "soffice' --convert-to pdf -outdir `dirname '%i'` '%i' 2> '%u'; /bin/mv '%i.pdf' '%o"
docx => "soffice' --convert-to pdf -outdir `dirname '%i'` '%i' 2> '%u'; /bin/mv '%i.pdf' '%o"
eps<=>pdf "gs' -sstdout=%%stderr -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 '-sDEVICE=pdfwrite' '-sOutputFile=%o' '-f%i"
odt => "soffice' --convert-to pdf -outdir `dirname '%i'` '%i' 2> '%u'; /bin/mv '%i.pdf' '%o"
pdf<=>ps "gs' -sstdout=%%stderr -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 '-sDEVICE=ps2write' -sPDFPassword='%a' '-sOutputFile=%o' '-f%i"
pdf<=>eps "gs' -sstdout=%%stderr -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 -sPDFPassword='%a' '-sDEVICE=eps2write' '-sOutputFile=%o' '-f%i"
ppt => "soffice' --convert-to pdf -outdir `dirname '%i'` '%i' 2> '%u'; /bin/mv '%i.pdf' '%o"
pptx => "soffice' --convert-to pdf -outdir `dirname '%i'` '%i' 2> '%u'; /bin/mv '%i.pdf' '%o"
ps<=>pdf "gs' -sstdout=%%stderr -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 '-sDEVICE=pdfwrite' '-sOutputFile=%o' '-f%i"
xls => "soffice' --convert-to pdf -outdir `dirname '%i'` '%i' 2> '%u'; /bin/mv '%i.pdf' '%o"
xlsx => "soffice' --convert-to pdf -outdir `dirname '%i'` '%i' 2> '%u'; /bin/mv '%i.pdf' '%o"
$

With this configuration I can convert an image file to PDF on my computer, but not in AWS Lambda.
Looking at the package created for the AWS Lambda function, I don't see this file, so that may be the problem.
I'm going to do some testing to see the result of the "identify -list policy" command in AWS Lambda.

Thanks a lot.

Re: conversion to pdf using aws lambda / node.js does not work

Posted: 2019-07-16T10:46:19-07:00
by Roberto
Hi guys,
Just to let you know, in the AWS Lambda instance I get the following.

For the "identify -list policy" command:

/var/task

Path: [built-in]
Policy: Undefined
rights: None
Path: /etc/ImageMagick/policy.xml
Policy: Coder
rights: None
pattern: EPHEMERAL
Policy: Coder
rights: None
pattern: HTTPS
Policy: Coder
rights: None
pattern: HTTP
Policy: Coder
rights: None
pattern: URL
Policy: Coder
rights: None
pattern: FTP
Policy: Coder
rights: None
pattern: MVG
Policy: Coder
rights: None
pattern: MSL
Policy: Coder
rights: None
pattern: TEXT
Policy: Coder
rights: None
pattern: LABEL
Policy: Path
rights: None
pattern: @*

For the "identify -list delegate | grep pdf" command:

/var/task
eps<=>pdf "gs" -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 "-sDEVICE=pdfwrite" "-sOutputFile=%o" "-f%i"
pdf<=>eps "gs" -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 "-sDEVICE=epswrite" "-sOutputFile=%o" "-f%i"
pdf<=>ps "gs" -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 "-sDEVICE=pswrite" "-sOutputFile=%o" "-f%i"
ps<=>pdf "gs" -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 "-sDEVICE=pdfwrite" "-sOutputFile=%o" "-f%i"

With these settings, should I be able to convert an image file to PDF?

Thanks

Re: conversion to pdf using aws lambda / node.js does not work

Posted: 2019-07-16T12:05:46-07:00
by fmw42
So is this resolved? If so, what was the solution?

Re: conversion to pdf using aws lambda / node.js does not work

Posted: 2019-07-16T13:02:08-07:00
by Roberto
Hi fmw42.
No, from what I saw, in the instance where the function runs, the policy.xml file is not enabled to work with PDF files.
I am looking to see if it is possible to somehow change the aws lambda instance policy.xml configuration.

Regards
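
One option that might be worth testing is ImageMagick's MAGICK_CONFIGURE_PATH environment variable, which names a directory to search for configuration (.xml) files. The sketch below is an assumption-heavy illustration: withMagickConfig is a hypothetical helper, the "imagemagick-config" folder name is made up, and whether the Lambda build honors the variable, or lets a policy found there take precedence over /etc/ImageMagick/policy.xml, is not confirmed anywhere in this thread.

```javascript
const path = require('path');

// Hypothetical helper: returns a copy of an environment object with
// ImageMagick's configuration search path set. The input is not modified.
function withMagickConfig(baseEnv, configDir) {
  return Object.assign({}, baseEnv, { MAGICK_CONFIGURE_PATH: configDir });
}

// Assumption: policy.xml and delegates.xml are shipped inside the deployment
// package in a folder named "imagemagick-config" (made-up name); /var/task is
// the working directory reported in the Lambda output above.
const configDir = path.posix.join('/var/task', 'imagemagick-config');

// In the handler, before the first im.convert call, one could then set:
//   process.env.MAGICK_CONFIGURE_PATH = configDir;
// Child processes started without an explicit env inherit process.env.
```

Running "identify -list policy" again from inside the function after setting the variable would show whether the packaged policy.xml is actually picked up.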

Re: conversion to pdf using aws lambda / node.js does not work

Posted: 2019-07-16T15:23:19-07:00
by fmw42
You may also have to edit the delegates.xml file to put the full path to Ghostscript in all entries where it says:

command="&quot;gs&quot;

Insert your path to ghostscript

command="&quot;path/gs&quot;

That is a common issue with PHP Imagick, since it does not use the $PATH environment variable for the shell.


Please report back, especially if you resolve it, since this is a common issue I see reported on stackoverflow also.
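
The edit described above can also be applied programmatically when building the Lambda package. A minimal sketch, assuming the delegates.xml text has already been read into a string: pinGhostscriptPath is a hypothetical helper name and /opt/bin/gs in the usage note is only a placeholder path.

```javascript
// Replaces every bare "gs" delegate command with a full path to Ghostscript.
// split/join is used instead of a regex so the path is inserted literally.
function pinGhostscriptPath(delegatesXml, gsPath) {
  const bare = 'command="&quot;gs&quot;';
  const pinned = 'command="&quot;' + gsPath + '&quot;';
  return delegatesXml.split(bare).join(pinned);
}
```

For example, pinGhostscriptPath(xmlText, '/opt/bin/gs') turns command="&quot;gs&quot; into command="&quot;/opt/bin/gs&quot; in every entry and leaves all other delegates untouched.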

Re: conversion to pdf using aws lambda / node.js does not work

Posted: 2019-09-30T12:10:48-07:00
by chaddjohnson
Hi, not sure if this will help, but here is how I got PDF support working in Lambda: https://gist.github.com/bensie/56f51bc3 ... nt-2983885