Turn a PDF into an array of PNGs using JavaScript (with pdf.js)

Question

I'm trying to write front-end code that asks the user for a PDF and then, entirely in the user's browser, produces an array of PNGs (via toDataURL), where each entry in the array corresponds to one page of the PDF:

dat[0] = png of page 1
dat[1] = png of page 2
...

When I test the code below, the pages are somehow rendered on top of each other and rotated.

<script src="http://cdnjs.cloudflare.com/ajax/libs/processing.js/1.4.1/processing-api.min.js"></script><html>
<!--
  Created using jsbin.com
  Source can be edited via http://jsbin.com/pdfjs-helloworld-v2/8598/edit
-->
<body>
  <canvas id="the-canvas" style="border:1px solid black"></canvas>
  <input id='pdf' type='file'/>

  <!-- Use latest PDF.js build from Github -->

  <script src="https://ajax.googleapis.com/ajax/libs/jquery/2.2.4/jquery.min.js"></script>
  <script src="pdf.js"></script>
  <script src="pdf.worker.js"></script>
  <script type="text/javascript">
    //
    // Asynchronous download PDF as an ArrayBuffer
    //
    dat = [];
    

    var pdf = document.getElementById('pdf');
    pdf.onchange = function(ev) {
      if (file = document.getElementById('pdf').files[0]) {
        fileReader = new FileReader();
        fileReader.onload = function(ev) {
          //console.log(ev);
          PDFJS.getDocument(fileReader.result).then(function getPdfHelloWorld(pdf) {
            //
            // Fetch the first page
            //
            number_of_pages = pdf.numPages;

            for(i = 1; i < number_of_pages+1; ++i) {
              pdf.getPage(i).then(function getPageHelloWorld(page) {

              var scale = 1;
              var viewport = page.getViewport(scale);

              //
              // Prepare canvas using PDF page dimensions
              //
              var canvas = document.getElementById('the-canvas');
              var context = canvas.getContext('2d');
              canvas.height = viewport.height;
              canvas.width = viewport.width;

              //
              // Render PDF page into canvas context
              //
              var renderContext = {
                canvasContext: context,
                viewport: viewport};
              page.render(renderContext).then(function() {
                dat.push(canvas.toDataURL('image/png'));
              });
              });
            }
            //console.log(pdf.numPages);
            //console.log(pdf)

          }, function(error){
            console.log(error);
          });
        };
        fileReader.readAsArrayBuffer(file);
      }
    }

  </script>


<style id="jsbin-css">

</style>
<script>

</script>
</body>
</html>

I'm only interested in the array dat. When I render the images in the array, I see:

dat[0] = png of page 1 (correct)
dat[1] = png of page 1 and png of page 2, rotated 180 degrees, on top of each other
...

How do I make sure that each entry of the array contains a single, correctly rendered page?

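Recommended answer

Try rendering each page on its own canvas. The loop in the question starts every page.render() call on the same canvas before any earlier render has finished, so the pages end up drawn over one another. You can create a canvas and append it to a container using: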

var canvasdiv = document.getElementById('canvas');      
var canvas = document.createElement('canvas');
canvasdiv.appendChild(canvas);
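
Since only the array is needed, the canvases do not have to be shown on the page at all. The following is a minimal sketch under a few assumptions: the function name `pdfToPngDataUrls` and the `createCanvas` parameter are hypothetical, `pdf` is an already loaded PDF.js document, and the PDF.js build supports the `page.getViewport({ scale: ... })` and `renderTask.promise` forms used in the full example in this answer. It gives every page a fresh canvas and collects the results with `Promise.all`, so the array follows page order even when renders finish out of order.

```javascript
// Hypothetical helper: resolves to an array of PNG data URLs in page order.
// `pdf` is a loaded PDF.js document; `createCanvas` returns a fresh canvas.
function pdfToPngDataUrls(pdf, createCanvas, scale) {
  var pagePromises = [];
  for (let pageNumber = 1; pageNumber <= pdf.numPages; pageNumber++) {
    pagePromises.push(
      pdf.getPage(pageNumber).then(function (page) {
        var viewport = page.getViewport({ scale: scale });

        // One canvas per page, so concurrent renders never share a context.
        var canvas = createCanvas();
        canvas.width = viewport.width;
        canvas.height = viewport.height;

        var renderTask = page.render({
          canvasContext: canvas.getContext('2d'),
          viewport: viewport
        });
        return renderTask.promise.then(function () {
          return canvas.toDataURL('image/png');
        });
      })
    );
  }
  // Promise.all keeps the order of pagePromises, not the order of completion.
  return Promise.all(pagePromises);
}
```

In a browser, `createCanvas` could be `function () { return document.createElement('canvas'); }`, and the resolved value plays the role of `dat` from the question.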

var url = 'https://file-examples-com.github.io/uploads/2017/10/file-sample_150kB.pdf';

var PDFJS = window['pdfjs-dist/build/pdf'];

PDFJS.GlobalWorkerOptions.workerSrc = '//mozilla.github.io/pdf.js/build/pdf.worker.js';

var loadingTask = PDFJS.getDocument(url);

loadingTask.promise.then(function(pdf) {

  var canvasdiv = document.getElementById('canvas');
  var totalPages = pdf.numPages
  var data = [];

  for (let pageNumber = 1; pageNumber <= totalPages; pageNumber++) {
    pdf.getPage(pageNumber).then(function(page) {

      var scale = 1.5;
      var viewport = page.getViewport({ scale: scale });

      var canvas = document.createElement('canvas');
      canvasdiv.appendChild(canvas);

      // Prepare canvas using PDF page dimensions
      var context = canvas.getContext('2d');
      canvas.height = viewport.height;
      canvas.width = viewport.width;

      // Render PDF page into canvas context
      var renderContext = { canvasContext: context, viewport: viewport };

      var renderTask = page.render(renderContext);
      renderTask.promise.then(function() {
        // Store by page index so data[0] is page 1 even if renders finish out of order
        data[pageNumber - 1] = canvas.toDataURL('image/png');
        console.log('page ' + pageNumber + ' loaded in data');
      });
    });
  }

}, function(reason) {
  // PDF loading error
  console.error(reason);
});

canvas {
  border: 1px solid black;
  margin: 5px;
  width: 25%;
}

<script src="//mozilla.github.io/pdf.js/build/pdf.js"></script>

<div id="canvas"></div>
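
If a single canvas is preferred, as in the question, an alternative that is not part of the answer above is to render the pages strictly one after another and wait for each render to finish before touching the canvas again. The sketch below assumes a hypothetical function name, `pdfToPngDataUrlsSequential`, an already loaded PDF.js document `pdf`, and the same `getViewport({ scale: ... })` and `renderTask.promise` forms as the example above. It trades speed for simplicity, since only one page is in flight at a time.

```javascript
// Hypothetical helper: renders pages one at a time into a single canvas.
async function pdfToPngDataUrlsSequential(pdf, canvas, scale) {
  var context = canvas.getContext('2d');
  var dat = [];
  for (let pageNumber = 1; pageNumber <= pdf.numPages; pageNumber++) {
    var page = await pdf.getPage(pageNumber);
    var viewport = page.getViewport({ scale: scale });

    // Resizing a canvas also clears it, so the previous page is wiped.
    canvas.width = viewport.width;
    canvas.height = viewport.height;

    // Wait for this page to finish before the loop touches the canvas again.
    await page.render({ canvasContext: context, viewport: viewport }).promise;

    // Pages are processed in order, so push() keeps dat[0] as page 1.
    dat.push(canvas.toDataURL('image/png'));
  }
  return dat;
}
```

With the question's markup, the call might look like `pdfToPngDataUrlsSequential(pdf, document.getElementById('the-canvas'), 1)`, which resolves to the `dat` array.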
