# PDF2Blocs
## Abstract
*This Python script converts a PDF file written in French into an HTML file.*
*The conversion organizes the textual content of the PDF file into separate blocks. Each block is then turned into an HTML element: H1, H2, P, FigCaption, Footer, or Header.*
*This program uses pdftohtml and pdftotext, two tools from the Poppler library (*
*It's run from the command line:*
python /link/to/file.pdf
*The result is written to standard output.*
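
As an illustration of how a Python script can drive the two Poppler tools and collect their output, here is a minimal sketch. It is not taken from PDF2Blocs itself: the function names `pdftotext_command`, `pdftohtml_command`, and `run_tool` are hypothetical, and the chosen options (`-layout`, `-xml`, `-i`) are assumptions about a reasonable setup, not the options the script actually passes.

```python
import shutil
import subprocess


def pdftotext_command(pdf_path):
    """Build the argument list asking pdftotext to print text to stdout."""
    # "-layout" keeps the physical layout; "-" as output file means stdout.
    return ["pdftotext", "-layout", "-enc", "UTF-8", pdf_path, "-"]


def pdftohtml_command(pdf_path):
    """Build the argument list asking pdftohtml for XML on stdout."""
    # "-xml" emits positioned text elements; "-i" ignores images.
    return ["pdftohtml", "-xml", "-i", "-stdout", pdf_path]


def run_tool(command):
    """Run a Poppler tool and return its standard output as text."""
    if shutil.which(command[0]) is None:
        raise FileNotFoundError(command[0] + " not found; is Poppler installed?")
    # check=True raises CalledProcessError if the tool exits with an error.
    result = subprocess.run(command, capture_output=True, check=True)
    return result.stdout.decode("utf-8", errors="replace")
```

With Poppler installed, `print(run_tool(pdftotext_command("/link/to/file.pdf")))` would write the extracted text to standard output.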
*The algorithm is described in French in the file of the archive.*
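
The segmentation algorithm itself is documented in the archive and is not reproduced here. The sketch below only illustrates the final step mentioned in the abstract, turning already labeled blocks into HTML elements. The names `TAG_FOR_KIND`, `render_block`, and `render_blocks`, and the representation of a block as a `(kind, text)` pair, are assumptions made for this example.

```python
from html import escape

# Block kinds named in the abstract, mapped to HTML element names.
TAG_FOR_KIND = {
    "H1": "h1",
    "H2": "h2",
    "P": "p",
    "FigCaption": "figcaption",
    "Footer": "footer",
    "Header": "header",
}


def render_block(kind, text):
    """Wrap one block's text in the HTML element for its kind."""
    tag = TAG_FOR_KIND[kind]
    # escape() protects characters such as "&" and "<" in the extracted text.
    return "<{0}>{1}</{0}>".format(tag, escape(text))


def render_blocks(blocks):
    """Render (kind, text) pairs as an HTML fragment, one element per line."""
    return "\n".join(render_block(kind, text) for kind, text in blocks)
```

For example, `render_blocks([("H1", "Titre"), ("P", "a & b")])` returns `<h1>Titre</h1>` followed on the next line by `<p>a &amp; b</p>`.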
## Summary
A Python script that segments digital documents in PDF format
whose textual content is in French.
