@Srivarshan-T

  • Updated package.json to include new dependencies: docx and html-docx-js.
  • Added localization support for the new PDF to Word feature in multiple languages (de, en, es, fr, hi, ja, nl, pt, ru, zh).
  • Implemented the PDF to Word conversion functionality in a new component.
  • Created service logic to handle PDF processing and conversion to Word format.
  • Added tests for the PDF to Word conversion service to ensure functionality and accuracy.

@Srivarshan-T
Author

This PR implements client-side PDF to Word conversion using pdfjs-dist for PDF parsing and docx for Word generation.

✅ Key Enhancements:

  • Preserves line height, spacing, and structure
  • More accurate layout for resume-style PDFs
  • Optimized for multi-page documents
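To illustrate what preserving line structure can involve, here is a minimal sketch of one common step: grouping extracted text fragments into lines by their vertical position. It is not taken from this PR's code. The `TextItemLike` interface is a hypothetical stand-in that mirrors only the `str` and `transform` fields of a pdfjs-dist text item, and the names `groupIntoLines` and `tolerance` are assumptions made for the example.

```typescript
// Hypothetical shape: mirrors only the `str` and `transform` fields of a
// pdfjs-dist text item. transform is [a, b, c, d, x, y] in PDF units.
interface TextItemLike {
  str: string;
  transform: number[];
}

interface TextLine {
  y: number; // baseline of the first fragment placed on the line
  text: string;
}

const xOf = (item: TextItemLike): number => item.transform[4] ?? 0;
const yOf = (item: TextItemLike): number => item.transform[5] ?? 0;

// Group fragments whose baselines are within `tolerance` units of the line's
// first fragment. PDF y grows upward, so sorting by descending y yields
// top-to-bottom reading order; each line is then ordered left to right.
function groupIntoLines(items: TextItemLike[], tolerance = 2): TextLine[] {
  const buckets: { y: number; items: TextItemLike[] }[] = [];
  const byY = [...items].sort((a, b) => yOf(b) - yOf(a));

  for (const item of byY) {
    if (item.str.trim() === "") continue; // skip whitespace-only fragments
    const last = buckets[buckets.length - 1];
    if (last && Math.abs(last.y - yOf(item)) <= tolerance) {
      last.items.push(item);
    } else {
      buckets.push({ y: yOf(item), items: [item] });
    }
  }

  // Joining with a single space is an approximation; real spacing would have
  // to be derived from fragment widths.
  return buckets.map((bucket) => ({
    y: bucket.y,
    text: bucket.items
      .sort((a, b) => xOf(a) - xOf(b))
      .map((item) => item.str)
      .join(" "),
  }));
}
```

Each resulting line could then be written out as one paragraph in the Word document. The difference between consecutive `y` values gives a rough estimate of line spacing. A heuristic like this breaks down on multi-column pages, tables, and images.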

🆚 Note:
A similar PR (#205) was opened earlier by another contributor. This PR builds on that idea but takes a different approach, with additional improvements and cleaner formatting. Happy to collaborate or refactor based on reviewer feedback.

Fixes: [#69]

- Removed `html-docx-js` version 0.3.1
- Updated `pdfjs-dist` from version 3.4.120 to 5.3.93
- Downgraded `@types/react` from version 18.3.23 to 18.3.3
@iib0011
Owner

iib0011 commented Jul 18, 2025

What about images?

@iib0011
Owner

iib0011 commented Jul 18, 2025

Sorry, this doesn't work properly. The layout is lost.

@Srivarshan-T
Author

Srivarshan-T commented Jul 18, 2025

Which layout? Can you send me a screenshot so I can fix it?

Owner

@iib0011 iib0011 left a comment

last comment

@iib0011
Owner

iib0011 commented Jul 18, 2025

This is not easy. Better to use a library

@Srivarshan-T
Author

OK, I will improve the accuracy, especially for layouts and images.

@Srivarshan-T
Author

@iib0011
After spending several hours researching this, I found that accurate PDF to Word conversion is not feasible with frontend-only solutions. It typically requires either a strong backend setup (e.g., Python scripts or Node.js processing) or third-party APIs, most of which are either limited in functionality or require paid plans for reliable results.

Since implementing this purely in the browser isn't practical without significant compromises, I'm moving on to another issue. Open to suggestions if we plan to support a backend for this feature in the future!

@iib0011 iib0011 closed this Jul 24, 2025