# Markdown-Paper2Code

**Repository Path**: gsepcsj/markdown-paper2-code

## Basic Information

- **Project Name**: Markdown-Paper2Code
- **Description**: Unofficial implementation of the paper "Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning" with the markdown-formatted paper file.
- **Primary Language**: Python
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 1
- **Created**: 2025-06-09
- **Last Updated**: 2025-06-09

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Simpler Paper2Code Markdown Version

## Introduction

This is a unofficial implementation of the paper "Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning" with the markdown-formatted paper content, simpler and easier for understanding and using with the OpenAI compatible API.

The original paper can be found at: https://arxiv.org/abs/2504.17192.

## Requirements
```plain
openai
docling
```

## Usage

### Convert Paper to Markdown
We suggest using the `docling` library to convert the paper to markdown format. The example command is:
> docling <pdf file path> --to md

Then you can obtain the markdown file with the paper content. We also advise you to remove the images from the original paper decreasing the useless tokens.

### Plan Process
using the converted markdown file to obtain the LLM planed json-formatted result. The example command is:
> python plan_process.py --paper_markdown `markdown file path` --base_url `LLM Service API URL` --model `LLM name` --api_key `LLM API Service API Key` --plan_json `LLM planed file path`

The LLM-planed json-formatted file will be saved in the `plan_json` path.

### Analysis Process
using the planed json-formatted file to obtain the detailed logical analysis result for next code generation process. The example command is:
> python analysis_process.py --plan_json `LLM planed file path` --base_url `LLM Service API URL` --model `LLM name` --api_key `LLM API Service API Key` --analysis_json `LLM detailed logical analysis file path`

The LLM-analyzed json-formatted file will be saved in the `analysis_json` path.

### Code Generation Process
using the analyzed json-formatted file to obtain the code generation result. The example command is:
> python coding_process.py --analysis_json  `LLM detailed logical analysis file path` --base_url `LLM Service API URL` --model `LLM name` --api_key `LLM API Service API Key` --save_dir `coding result python file saved directory` --coding_json `LLM code generation file path`

The final generated code files are in the `save_dir`, and the final results will save in the `coding_json` path.

**Notice**: 
* Our implementation is based on the [original repository](https://github.com/going-doer/Paper2Code) of the paper, and there are a slight different of the prompts in our implementation to suit the markdown-formatted paper content. 
* The Paper2Code framework is very helpful to create the initial reference of the papers which do not share the open-source code, or only include `readme.md` containing "release soon" in the Github / Gitee.