# html2text

**Repository Path**: yunqi-zwt/html2text

## Basic Information

- **Project Name**: html2text
- **Description**: Convert HTML to Markdown-formatted text.
- **Primary Language**: Python
- **License**: GPL-3.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2021-11-22
- **Last Updated**: 2022-01-21

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# [html2text](http://www.aaronsw.com/2002/html2text/)

html2text is a Python script that converts a page of HTML into clean, easy-to-read plain ASCII text. Better yet, that ASCII also happens to be valid Markdown (a text-to-HTML format).

Usage: `html2text.py [(filename|url) [encoding]]`

    Options:
      --version             show program's version number and exit
      -h, --help            show this help message and exit
      --ignore-links        don't include any formatting for links
      --ignore-images       don't include any formatting for images
      -g, --google-doc      convert an html-exported Google Document
      -d, --dash-unordered-list
                            use a dash rather than a star for unordered list items
      -b BODY_WIDTH, --body-width=BODY_WIDTH
                            number of characters per output line, 0 for no wrap
      -i LIST_INDENT, --google-list-indent=LIST_INDENT
                            number of pixels Google indents nested lists
      -s, --hide-strikethrough
                            hide strike-through text. only relevant when -g is
                            specified as well

Or you can use it from within Python:

    import html2text
    print(html2text.html2text("<p>Hello, world.</p>"))

Or with some configuration options:

    import html2text
    h = html2text.HTML2Text()
    h.ignore_links = True
    print(h.handle("<p>Hello, <a href='http://earth.google.com/'>world</a>!"))

_Originally written by Aaron Swartz. This code is distributed under the GPLv3._

## How to do a release

1. Update the version in `html2text.py`
2. Update the version in `setup.py`
3. Run `python setup.py sdist upload`

## How to run unit tests

    cd test/
    python run_tests.py

[Build status](http://travis-ci.org/aaronsw/html2text)