# AI-powered Research and Review Agents [ICLR 2025 / ACL 2025]
## 🚀 Getting Started
### Installation
```bash
pip install ai_researcher
```
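If the install succeeded, the package's main entry points should import cleanly. A quick sanity check (not part of the official docs, just a sketch using the classes shown below):

```python
# Sanity check: these imports should succeed after installation
from ai_researcher import CycleResearcher, CycleReviewer, DeepReviewer
```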
### Using CycleResearcher
```python
# Import necessary libraries
from ai_researcher import CycleResearcher
from ai_researcher.utils import print_paper_summary
# Initialize CycleResearcher with the default 12B model
researcher = CycleResearcher(model_size="12B")
# Load references from BibTeX file
with open('cycleresearcher_references.bib', 'r') as f:
    references_content = f.read()
# Generate a paper with specific references
generated_papers = researcher.generate_paper(
    topic="AI Researcher",
    references=references_content,
    n=1  # Generate a single paper
)
# Print summary of generated paper
print_paper_summary(generated_papers[0])
```
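The exact structure of each generated paper object depends on the library version. Assuming it behaves like a dict with fields such as `title` and `latex` (field names here are illustrative, not confirmed by this README), you could persist the output like this:

```python
# Persist the generated paper to disk.
# NOTE: 'latex' is an assumed field name for illustration;
# inspect generated_papers[0].keys() to see what your version returns.
paper = generated_papers[0]
print("Available fields:", list(paper.keys()))

with open('generated_paper.tex', 'w') as f:
    f.write(paper.get('latex', ''))
```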
### Using CycleReviewer
```python
# Import necessary libraries
from ai_researcher import CycleReviewer
# Initialize CycleReviewer with the default 8B model
reviewer = CycleReviewer(model_size="8B")
# Load the paper to review (path is illustrative; any plain-text or LaTeX source works)
with open('paper.tex', 'r') as f:
    paper_text = f.read()

# Review the paper
review_results = reviewer.evaluate(paper_text)
# Print review results
print(f"Average score: {review_results[0]['avg_rating']}")
print(f"Decision: {review_results[0]['paper_decision']}")
```
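Since CycleResearcher and CycleReviewer live in the same package, one natural workflow is to feed generated papers straight back into the reviewer. The sketch below continues from the two snippets above and assumes the generated paper exposes its full text via a `latex` field, which is not confirmed by this README:

```python
# Sketch: review a paper produced by CycleResearcher.
# The 'latex' field name is an assumption; check the generated dict's keys.
generated_text = generated_papers[0].get('latex', '')
loop_results = reviewer.evaluate(generated_text)
print(f"Generated paper scored {loop_results[0]['avg_rating']} "
      f"({loop_results[0]['paper_decision']})")
```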
### Using DeepReviewer
```python
# Import necessary libraries
from ai_researcher import DeepReviewer
# Initialize DeepReviewer with 14B model
deep_reviewer = DeepReviewer(model_size="14B")
# Review a paper with multiple simulated reviewers in Standard Mode
review_results = deep_reviewer.evaluate(
    paper_text,
    mode="Standard Mode",  # Options: "Fast Mode", "Standard Mode", "Best Mode"
    reviewer_num=4  # Simulate 4 different reviewers
)
# Print review results
for i, review in enumerate(review_results[0]['reviews']):
    print(f"Reviewer {i+1} Rating: {review.get('rating', 'N/A')}")
    print(f"Reviewer {i+1} Summary: {review.get('summary', 'N/A')[:100]}...")
```
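To condense the simulated panel into a single number, you can average the per-reviewer ratings. The sketch below assumes each rating is numeric or a numeric-prefixed string (e.g. `"6: marginally above threshold"`); this format is an assumption and may not match every model version:

```python
# Aggregate the simulated reviewers' ratings into a mean score.
# Assumes ratings are numbers or strings starting with one (illustrative).
def to_score(rating):
    try:
        return float(str(rating).split(':')[0])
    except ValueError:
        return None

scores = [to_score(r.get('rating')) for r in review_results[0]['reviews']]
scores = [s for s in scores if s is not None]
if scores:
    print(f"Panel mean rating: {sum(scores) / len(scores):.2f}")
```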
#### Launching DeepReviewer Best Mode
##### Using OpenScholar
OpenScholar is a retrieval-augmented generation (RAG) based question-answering system for academic research. For detailed usage instructions, see the [OpenScholar directory](./OpenScholar/).
##### Quick Start Guide for OpenScholar
1. **Apply for Semantic Scholar API Key**: Visit [Semantic Scholar API](https://www.semanticscholar.org/product/api)
2. **Start Model Services**:
```bash
# For Linux/Mac users
cd OpenScholar
chmod +x start_models.sh
./start_models.sh
```
3. **Start API Service**:
```bash
python openscholar_api.py \
    --s2_api_key YOUR_SEMANTIC_SCHOLAR_API_KEY \
    --reranker_path OpenSciLM/OpenScholar_Reranker
```
4. **Using the API**:
```python
import requests
# Send questions to OpenScholar API
response = requests.post("http://localhost:38015/batch_ask", json={
    "questions": ["How do retrieval-augmented LMs perform in knowledge-intensive tasks?"]
})
result = response.json()
print("OpenScholar Answer:", result["results"][0]["output"])
```
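The endpoint name `batch_ask` suggests multiple questions per request. Assuming the `results` list is returned in the same order as `questions` (consistent with the single-question example above, but not confirmed by this README), a batched call looks like:

```python
import requests

# Batch several questions in one request; results are assumed to come
# back in the same order as the questions.
questions = [
    "How do retrieval-augmented LMs perform in knowledge-intensive tasks?",
    "What are common failure modes of dense passage retrieval?",
]
response = requests.post("http://localhost:38015/batch_ask",
                         json={"questions": questions})
for question, result in zip(questions, response.json()["results"]):
    print(f"Q: {question}\nA: {result['output']}\n")
```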
#### Best Mode
DeepReviewer's Best Mode provides the most comprehensive review experience, including background knowledge search, multi-reviewer simulation, and self-verification:
```python
# Use Best Mode for in-depth review
review_results = deep_reviewer.evaluate(
    paper_text,
    mode="Best Mode",  # Most comprehensive review mode
    reviewer_num=6,  # Simulate 6 different reviewers
    enable_search=True,  # Enable background knowledge search
    self_verification=True  # Enable self-verification
)
```
## 📊 Model Evaluation
CycleResearcher-12B achieves an average score of 5.36, approaching the 5.69 average for conference-accepted papers and surpassing AI Scientist's score of 4.31.
CycleReviewer outperforms both proprietary systems and human experts with a 48.77% reduction in Proxy MSE and a 26.89% reduction in Proxy MAE compared to human reviewers. With a decision accuracy of 74.24%, our model demonstrates a significant lead over other closed-source systems.
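For reference, these proxy metrics read most naturally as the usual squared and absolute errors between a system's predicted score $\hat{s}_i$ and the ground-truth review score $s_i$ over $N$ papers. This is a standard formulation sketched from the metric names alone, not a definition taken from the paper:

```math
\mathrm{Proxy\ MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(\hat{s}_i - s_i\right)^2,
\qquad
\mathrm{Proxy\ MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|\hat{s}_i - s_i\right|
```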
DeepReviewer provides multi-perspective simulation with self-verification, enabling more comprehensive and balanced feedback. It offers three distinct review modes: Fast Mode, Standard Mode, and Best Mode to accommodate different use cases.
### CycleResearcher Models

| Model Name | Pre-training Language Model | HF Link |
|---|---|---|
| CycleResearcher-ML-12B | Mistral-Nemo-Instruct-2407 | 🤗 link |
| CycleResearcher-ML-72B | Qwen2.5-72B-Instruct | 🤗 link |
| CycleResearcher-ML-123B | Mistral-Large-2 | 🤗 link |

### CycleReviewer Models

| Model Name | Pre-training Language Model | HF Link |
|---|---|---|
| CycleReviewer-ML-Llama3.1-8B | Llama3.1-8B-Instruct | 🤗 link |
| CycleReviewer-ML-Llama3.1-70B | Llama3.1-70B-Instruct | 🤗 link |
| CycleReviewer-ML-Pro-123B | Mistral-Large-2 | 🤗 link |

### DeepReviewer Models

| Model Name | Parameters | HF Link |
|---|---|---|
| DeepReviewer-7B | 7B | 🤗 link |
| DeepReviewer-14B | 14B | 🤗 link |

### Datasets

| Dataset Name | Train Data | Test Data | Description | HF Link |
|---|---|---|---|---|
| Review-5K | 4,189 | 781 | Peer review dataset for CycleReviewer training | 🤗 link |
| Research-14K | 12,696 | 802 | Research paper dataset for CycleResearcher training | 🤗 link |
| DeepReview-13K | 13,378 | 1,286 | Multi-perspective review dataset for DeepReviewer training | 🤗 link |
### DeepReviewer Review Modes

| Mode | Description |
|---|---|
| Fast Mode | Quick review generation for rapid feedback; provides essential evaluation without multi-reviewer simulation. |
| Standard Mode | Default mode that simulates multiple reviewers and includes self-verification to ensure reliable assessments. |
| Best Mode | Most comprehensive mode with background knowledge search, multi-reviewer simulation, and self-verification for in-depth analysis. |