# FinSight **Repository Path**: email4reg/FinSight ## Basic Information - **Project Name**: FinSight - **Description**: No description available - **Primary Language**: Unknown - **License**: GPL-3.0 - **Default Branch**: main - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 0 - **Created**: 2025-12-19 - **Last Updated**: 2025-12-19 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README
FinSight: Towards Real-World Financial Deep Research --- *From data to insights, fully automated, multi-modal financial reports.*

[![arXiv](https://img.shields.io/badge/arXiv-2510.16844-b31b1b.svg?style=flat)](https://arxiv.org/abs/2510.16844) [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE) [![Python](https://img.shields.io/badge/Python-3.10+-3776AB?logo=python&logoColor=white)](https://www.python.org/) ![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg) DeepWiki Document [![AFAC2025 Track 4](https://img.shields.io/badge/πŸ†_AFAC2025-1st_Place_(1/1289)-gold.svg?style=flat)](https://tianchi.aliyun.com/specials/promotion/afac2025)

**FinSight** is a multi-agent research system that automates the entire financial research process β€” from data collection and analysis to generating publication-ready reports with professional charts and deep insights. 🎯 **One ticker, one click, one comprehensive research report.**
*If this project helps you, please ⭐ star & 🍴 fork!*
## πŸŽ₯ Demo https://github.com/user-attachments/assets/41963369-3dd4-4dfd-ad95-ef95cd092ebb

Easy-to-Use UI demo: one ticker ➜ automated research ➜ publish-ready report.

## πŸ“‘ Table of Contents - [✨ Key Features](#-key-features) - [πŸ—ΊοΈ Roadmap](#-roadmap) - [πŸš€ Quick Start](#-quick-start) - [🎨 Result Examples](#-result-examples) - [πŸ—οΈ Architecture](#-architecture) - [πŸ“– Advanced Usage](#-advanced-usage) - [πŸ“Š Evaluation Results](#-evaluation-results) - [πŸ“œ License](#-license) - [πŸ“– Citation](#-citation) - [πŸ™ Acknowledgments](#-acknowledgments) ## ✨ Key Features * **πŸ“Š Professional-Grade Report Generation** One-click to generate 20,000+ word financial reports that rival human experts. Outperforms GPT-5 and Perplexity Deep Research in factual accuracy, analytical depth, and presentation quality. * **πŸ€– Programmable Code Agent (CAVM)** Implementation of the *Code Agent with Variable Memory* architecture. Unlike rigid workflows, agents operate in a unified variable space, executing Python code to manipulate data, tools, and memory dynamically for transparent and reproducible analysis. * **πŸ“ˆ Automated Charting with VLM Feedback** Solves the "ugly AI chart" problem. Built-in visual agents strictly follow professional standards, automatically correcting missing legends, wrong scales, or low information density through visual feedback loops. * **πŸ” Deep Research with Evidence Tracing** No more black-box summaries. Every conclusion is derived from a transparent *Chain-of-Analysis*, with strict citations linking back to original data sources, ensuring high textual faithfulness and verifiable insights. * **⚑ Comprehensive Market Intelligence** A unified interface for multi-source financial data. access real-time stock quotes, financial statements, and macro indicators across A-share and HK markets, powered by a robust Python-based tool ecosystem. --- ## πŸ—ΊοΈ Roadmap FinSight is still under development and there are many issues and room for improvement. We will continue to update. And we also sincerely welcome contributions on this open-source toolkit. - [x] Multi-agent collaborative research workflow (collector β†’ analyzer β†’ report) - [x] VLM-powered chart generation + critique loops for clean visuals - [x] Checkpoint/resume for long-running tasks - [x] Interactive web demo (frontend + backend) - [ ] General-purpose research adaptation beyond finance - [ ] Multi-market support (US, global) - [ ] Plugin system for custom tools and agents --- ## πŸš€ Quick Start ### Prerequisites - Python 3.10+ - Pandoc (for polished DOCX/PDF export) - Node.js (optional, for the web UI) - API keys for your LLM stack (LLM, VLM, Embedding, Search) ### Installation ```bash # Clone the repository git clone https://github.com/RUC-NLPIR/FinSight.git cd FinSight # Install Python dependencies pip install -r requirements.txt ``` Install Pandoc (recommended): ```bash # Linux sudo apt-get install pandoc # macOS brew install pandoc # Windows # Download the latest installer: # https://github.com/jgm/pandoc/releases/latest ``` Build the web UI (optional): ```bash cd demo/frontend npm install npm run build ``` ### Configuration FinSight uses a two-layer configuration: 1) `.env` β€” model endpoints & API keys ```bash cp .env.example .env # fill in DS_MODEL_NAME / API keys / base URLs ``` 2) `my_config.yaml` β€” research target & tasks ```yaml target_name: "Your Company Name" stock_code: "000001" # A-share or HK ticker target_type: "financial_company" # financial_company | macro | industry | general output_dir: "./outputs/my-research" language: "en" # en or zh custom_collect_tasks: - "Balance sheet, income statement, cash flow" - "Stock price data and trading volume" ``` ### Run FinSight **CLI (full pipeline)** ```bash python run_report.py ``` **Web Demo** ```bash # Backend cd demo/backend && python app.py # Frontend cd demo/frontend && npm run dev ``` Open http://localhost:3000 --- ## 🎨 Result Examples
| | | |:---:|:---:| | Core revenue analysis | Multi-dimensional financial data | | **Core revenue analysis** | **Multi-dimensional financial data** | | Publication-grade report | Chart-grounded analysis | | **Publication-grade report** | **Chart-grounded analysis** |
Full sample reports live in `assets/example_reports`. ![Full report preview](/assets/example5.png) --- ## πŸ—οΈ Architecture

FinSight Architecture

FinSight is a multi-stage, memory-centric pipeline: Data Collection β†’ Analysis + VLM chart refinement β†’ Report drafting & polishing β†’ Rendering. Each agent runs in a shared variable space with resumable checkpoints. **Agent roster** | Agent | Purpose | Key inputs | Outputs | Default tools/skills | |-------|---------|------------|---------|----------------------| | πŸ“₯ Data Collector | Route and gather structured/unstructured data | Task, ticker/market, custom tasks | Normalized datasets in memory | DeepSearch Agent; all financial/macro/industry tools | | πŸ” Deep Search Agent | Multi-hop web search + content fetch with source validation | Task, query | Search snippets + crawled pages with citations | Serper/Google search; web page fetcher | | πŸ”¬ Data Analyzer | Code-first analysis, charting, VLM critique | Task, analysis task, collected data | Analysis report, charts + captions | DeepSearch Agent; custom palette injection | | πŸ“ Report Generator | Outline β†’ sections β†’ polish β†’ cover/reference β†’ DOCX/PDF | Task, outlines, analysis/memory | Publication-ready report (MD/DOCX/PDF) | DeepSearch Agent; Pandoc + docx2pdf pipeline | **Tool library (high-level)** | Name | Domain | Type | What it does | |------|--------|------|--------------| | Stock profile | Financial | Data API | Corporate profile (A/HK) | | Shareholding structure | Financial | Data API | Top holders, stakes, direct/indirect flags | | Equity valuation metrics | Financial | Data API | PE/PB/ROE/margins | | Stock candlestick data | Financial | Data API | Daily OHLCV with turnover/ROC | | Balance sheet | Financial | Data API | Pivoted balance sheet (HK/A) | | Income statement | Financial | Data API | Pivoted P&L (HK/A) | | Cash-flow statement | Financial | Data API | Pivoted cash flows (HK/A) | | CSI 300 / HSI / SSE / Nasdaq | Market Index | Data API | Daily index OHLCV | | China macro leverage ratio | Macro | Data API | Household/corporate/gov leverage | | Enterprise commodity price index | Macro/Industry | Data API | Commodity price index & sub-series | | China LPR benchmark rates | Macro | Data API | 1Y/5Y benchmark rates | | Urban surveyed unemployment | Macro | Data API | Unemployment by age/city | | Total social financing increment | Macro | Data API | TSF components since 2015 | | China GDP / CPI / PPI YoY | Macro | Data API | Core macro indicators | | US CPI YoY | Macro | Data API | US inflation series | | China exports/imports YoY, trade balance | Macro | Data API | External sector metrics | | Fiscal revenue / FX loan / Money supply / RRR | Macro | Data API | Monetary & fiscal trackers | | FX and gold reserves | Macro | Data API | Monthly reserves | | National stock trading stats | Macro | Data API | Market-wide trading metrics | | Economic policy uncertainty (CN) | Macro | Data API | EPU index | | Industrial value-added growth | Industry | Data API | Industrial production trends | | Above-scale industrial production YoY | Industry | Data API | Large enterprise production | | Manufacturing PMI (official) | Industry | Data API | PMI time series | | Caixin services PMI | Industry | Data API | Services PMI | | Consumer price / retail price index | Industry | Data API | CPI/RPI monthly | | GDP (monthly stats) | Industry | Data API | GDP-related monthly stats | | Producer price index | Industry | Data API | PPI (ex-factory) | | Consumer confidence index | Industry | Data API | Sentiment series | | Total retail sales of consumer goods | Industry | Data API | Retail sales + YoY/MoM | | Bing web search (requests) | Web | Search | HTML-based Bing search | | Google Search Engine (Serper) | Web | Search | API Google search | | Bocha web search | Web | Search | CN-focused search API | | DuckDuckGo / Sogou search | Web | Search | Alternative HTML searches | | Financial site in-domain search (requests/playwright) | Web | Search | Scoped finance domains | | Bing web search (Playwright) | Web | Search (browser) | Dynamic page search | | Bing image search | Web | Image search | Image/snippet retrieval | | Web page content fetcher | Web | Crawler | HTML/PDF crawl + markdown extraction | --- ## πŸ“– Advanced Usage > **πŸ“š Full Documentation**: See **[docs/ADVANCED_USAGE.md](docs/ADVANCED_USAGE.md)** for comprehensive technical documentation with code examples.
πŸ”‘ API Keys & Model Configuration ### Environment Variables (`.env`) FinSight requires three model types. Create a `.env` file in the project root: ```bash # LLM (Main reasoning, code generation) DS_MODEL_NAME="deepseek-chat" DS_API_KEY="sk-your-key" DS_BASE_URL="https://api.deepseek.com/v1" # VLM (Chart analysis and refinement) VLM_MODEL_NAME="qwen-vl-max" VLM_API_KEY="sk-your-key" VLM_BASE_URL="https://dashscope.aliyuncs.com/compatible-mode/v1" # Embedding (Semantic search) EMBEDDING_MODEL_NAME="text-embedding-v3" EMBEDDING_API_KEY="sk-your-key" EMBEDDING_BASE_URL="https://dashscope.aliyuncs.com/compatible-mode/v1" # Web Search (Optional) SERPER_API_KEY="your-serper-key" # Google Search via Serper BOCHAAI_API_KEY="your-bocha-key" # Bocha Search (Chinese-focused) ``` ### Using Aggregator Endpoints ```bash # Example: OpenRouter DS_MODEL_NAME="openai/gpt-4o" DS_API_KEY="sk-or-xxx" DS_BASE_URL="https://openrouter.ai/api/v1" ``` ### Config Loading Logic The `Config` class loads settings in priority order: 1. `src/config/default_config.yaml` (defaults) 2. `my_config.yaml` (your overrides) 3. Runtime `config_dict` (highest priority) Environment variables are resolved via `${VAR_NAME}` syntax in YAML. **Quick Test**: ```python from src.config import Config config = Config(config_file_path='my_config.yaml') print(config.llm_dict.keys()) # Available models ```
πŸ“ YAML Config File Reference ### Complete `my_config.yaml` Structure ```yaml # ===== Target Configuration ===== target_name: "Company Name" # Research target stock_code: "000001" # Ticker (A-share or HK format) target_type: 'financial_company' # financial_company | macro | industry | general output_dir: "./outputs/my-research" # Output directory language: 'en' # en | zh # ===== Template Paths ===== reference_doc_path: 'src/template/report_template.docx' outline_template_path: 'src/template/company_outline.md' # ===== Custom Tasks (Optional) ===== # If omitted, LLM auto-generates appropriate tasks custom_collect_tasks: - "Financial statements (balance sheet, income, cash flow)" - "Stock price history and trading volume" - "Shareholding structure" custom_analysis_tasks: - "Analyze revenue trends and growth drivers" - "Evaluate profitability metrics (ROE, margins)" - "Compare with industry peers" # ===== Cache/Resume Settings ===== use_collect_data_cache: True use_analysis_cache: True use_report_outline_cache: True use_full_report_cache: True use_post_process_cache: True # ===== LLM Configuration ===== llm_config_list: - model_name: "${DS_MODEL_NAME}" api_key: "${DS_API_KEY}" base_url: "${DS_BASE_URL}" generation_params: temperature: 0.7 max_tokens: 32768 top_p: 0.95 - model_name: "${EMBEDDING_MODEL_NAME}" api_key: "${EMBEDDING_API_KEY}" base_url: "${EMBEDDING_BASE_URL}" - model_name: "${VLM_MODEL_NAME}" api_key: "${VLM_API_KEY}" base_url: "${VLM_BASE_URL}" ``` ### Target Types Explained | Type | Use Case | Default Tools | |------|----------|---------------| | `financial_company` | Listed company research | All financial + market tools | | `macro` | Macroeconomic analysis | Macro indicators + GDP/CPI tools | | `industry` | Industry/sector research | Industry + PMI tools | | `general` | General deep research | Web search only |
✏️ Prompt System & Customization ### Prompt Directory Structure ``` src/agents/ β”œβ”€β”€ data_analyzer/prompts/ β”‚ β”œβ”€β”€ general_prompts.yaml # For general research β”‚ └── financial_prompts.yaml # For financial reports β”œβ”€β”€ report_generator/prompts/ β”‚ β”œβ”€β”€ general_prompts.yaml β”‚ β”œβ”€β”€ financial_company_prompts.yaml β”‚ β”œβ”€β”€ financial_macro_prompts.yaml β”‚ └── financial_industry_prompts.yaml β”œβ”€β”€ data_collector/prompts/ β”‚ └── prompts.yaml └── search_agent/prompts/ └── general_prompts.yaml ``` ### Using the Prompt Loader ```python from src.utils.prompt_loader import get_prompt_loader # Load prompts for an agent loader = get_prompt_loader('data_analyzer', report_type='financial') # Get a specific prompt prompt = loader.get_prompt('data_analysis', current_time="2024-12-01", user_query="Analyze revenue trends", data_info="Available datasets...", target_language="English" ) # List all available prompts print(loader.list_available_prompts()) ``` ### Creating Custom Prompts 1. Create `src/agents/data_analyzer/prompts/my_custom_prompts.yaml`: ```yaml data_analysis: | You are an expert analyst for {industry_name}. ## Task {user_query} ## Available Data {data_info} ## Instructions Use for Python code, for final output. All output must be in {target_language}. ``` 2. Set `target_type: 'my_custom'` in config to load your prompts. ### Key Prompt Variables | Variable | Description | |----------|-------------| | `{current_time}` | Timestamp for time-sensitive analysis | | `{user_query}` | The analysis task | | `{data_info}` | Available datasets catalog | | `{api_descriptions}` | Tool API documentation | | `{target_language}` | Output language (English/Chinese) |
πŸ“‘ Custom Outlines & Report Templates ### Outline Template Configuration Set in `my_config.yaml`: ```yaml outline_template_path: 'src/template/company_outline.md' ``` ### Creating Custom Outlines **For Financial Company Reports**: ```markdown # Executive Summary Key metrics, investment thesis, rating. # Company Overview - Business description and history - Management and governance - Shareholder structure # Industry Analysis - Market size and growth - Competitive landscape # Financial Analysis - Revenue and profitability trends - Balance sheet analysis - Cash flow analysis # Valuation - Comparable company analysis - DCF valuation - Target price # Risks - Key risks and mitigants ``` **For General Research**: ```markdown # Introduction Research objectives and scope. # Background Context and literature review. # Methodology Data sources and approach. # Findings - Finding 1 - Finding 2 - Finding 3 # Discussion Implications and analysis. # Conclusion Summary and recommendations. ``` ### Reference Document (Word Styling) The `reference_doc_path` controls Word output formatting: ```yaml reference_doc_path: 'src/template/report_template.docx' ``` **To customize**: 1. Copy the default template 2. Edit styles in Word (Heading 1/2/3, Normal, Table styles) 3. Update the config path
🎨 Chart Styling & Color Palettes ### Default Color Palette ```python # In src/agents/data_analyzer/data_analyzer.py custom_palette = [ "#8B0000", # deep crimson "#FF2A2A", # bright red "#FF6A4D", # orange-red "#FFDAB9", # pale peach "#FFF5E6", # cream "#FFE4B5", # beige "#A0522D", # sienna "#5C2E1F", # dark brown ] ``` ### Custom Palette via Subclass ```python class MyDataAnalyzer(DataAnalyzer): async def _prepare_executor(self): await super()._prepare_executor() # Corporate blue theme my_palette = [ "#003366", "#0066CC", "#66B2FF", "#CCE5FF", "#E6F2FF" ] self.code_executor.set_variable("custom_palette", my_palette) ``` ### VLM Chart Critique Loop Charts go through iterative refinement: 1. **LLM generates** chart code 2. **Code executes** and saves PNG 3. **VLM evaluates** quality (clarity, labels, aesthetics) 4. **If not "FINISH"**: LLM refines based on VLM feedback 5. **Repeat** up to `max_iterations` (default: 3) Customize in `draw_chart` prompt or adjust iterations: ```python chart_code, chart_name = await self._draw_single_chart( task=..., max_iterations=5 # More refinement cycles ) ```
πŸ› οΈ Adding Custom Tools ### Tool Base Class ```python from src.tools.base import Tool, ToolResult class MyCustomTool(Tool): def __init__(self): super().__init__( name="My Custom Tool", description="Description for LLM to understand usage", parameters=[ {"name": "param1", "type": "str", "description": "First parameter", "required": True}, {"name": "param2", "type": "int", "description": "Optional parameter", "required": False}, ] ) async def api_function(self, param1: str, param2: int = 10): # Fetch or compute data result_data = await self._fetch_data(param1, param2) return [ ToolResult( name=f"Result for {param1}", description="What this data contains", data=result_data, # DataFrame, dict, list, etc. source="Data source URL or description" ) ] ``` ### Auto-Registration Place your tool in the appropriate category folder: ``` src/tools/ β”œβ”€β”€ financial/ β”‚ └── my_stock_tool.py ← Your new tool β”œβ”€β”€ macro/ β”œβ”€β”€ industry/ └── web/ ``` Tools are **automatically registered** on import: ```python from src.tools import list_tools, get_tool_by_name print(list_tools()) # Your tool appears here tool = get_tool_by_name('My Custom Tool')() result = await tool.api_function(param1='test') ``` ### Example: Stock Technical Analysis Tool ```python # src/tools/financial/technical_tool.py import pandas as pd from ..base import Tool, ToolResult class TechnicalAnalysisTool(Tool): def __init__(self): super().__init__( name="Stock technical indicators", description="Calculate SMA, RSI, MACD for a stock", parameters=[ {"name": "stock_code", "type": "str", "description": "Stock ticker", "required": True}, ], ) async def api_function(self, stock_code: str): import efinance as ef df = ef.stock.get_quote_history(stock_code) # Calculate indicators... df['SMA_20'] = df['ζ”Άη›˜'].rolling(window=20).mean() return [ToolResult( name=f"Technical indicators for {stock_code}", description="SMA, RSI indicators", data=df, source="Calculated from exchange data" )] ```
πŸ€– Adding Custom Agents ### Agent Base Class ```python from src.agents.base_agent import BaseAgent class MyCustomAgent(BaseAgent): AGENT_NAME = 'my_custom_agent' AGENT_DESCRIPTION = 'Description for use as a sub-agent' NECESSARY_KEYS = ['task', 'custom_param'] def __init__(self, config, tools=None, use_llm_name="deepseek-chat", enable_code=True, memory=None, agent_id=None): if tools is None: tools = self._get_default_tools() super().__init__(config, tools, use_llm_name, enable_code, memory, agent_id) # Load prompts from src.utils.prompt_loader import get_prompt_loader self.prompt_loader = get_prompt_loader('my_custom_agent') async def _prepare_init_prompt(self, input_data: dict) -> list[dict]: prompt = self.prompt_loader.get_prompt('main_prompt', task=input_data['task'], current_time=self.current_time ) return [{"role": "user", "content": prompt}] # Custom action handlers async def _handle_analyze_action(self, action_content: str): result = await self._perform_analysis(action_content) return {"action": "analyze", "result": result, "continue": True} async def _handle_final_action(self, action_content: str): return {"action": "final", "result": action_content, "continue": False} ``` ### Using Agents as Tools ```python class ParentAgent(BaseAgent): def _set_default_tools(self): self.tools = [ MyCustomAgent(config=self.config, memory=self.memory), DeepSearchAgent(config=self.config, memory=self.memory), ] ```
πŸ’Ύ Checkpoint & Resume System ### How It Works Each agent saves state during execution: ```python await self.save( state={ 'conversation_history': conversation_history, 'current_round': current_round, }, checkpoint_name='latest.pkl' ) ``` ### Checkpoint Locations ``` outputs// β”œβ”€β”€ memory/ β”‚ └── memory.pkl # Global memory state β”œβ”€β”€ agent_working/ β”‚ β”œβ”€β”€ agent_data_collector_xxx/ β”‚ β”‚ └── .cache/latest.pkl β”‚ β”œβ”€β”€ agent_data_analyzer_xxx/ β”‚ β”‚ β”œβ”€β”€ .cache/latest.pkl β”‚ β”‚ β”œβ”€β”€ .cache/charts.pkl β”‚ β”‚ └── images/ β”‚ └── agent_report_generator_xxx/ β”‚ └── .cache/ β”‚ β”œβ”€β”€ outline_latest.pkl β”‚ β”œβ”€β”€ section_0.pkl β”‚ └── report_latest.pkl └── logs/ ``` ### Controlling Resume ```python # Resume from checkpoints (default) asyncio.run(run_report(resume=True)) # Fresh start (ignores checkpoints) asyncio.run(run_report(resume=False)) ``` ### Memory Data Flow ```python from src.memory import Memory memory = Memory(config=config) memory.load() # Load from checkpoint # Access collected data data_list = memory.get_collect_data() # Access analysis results analyses = memory.get_analysis_result() # Semantic search relevant = await memory.retrieve_relevant_data( query="revenue trends", top_k=10, embedding_model="text-embedding-v3" ) ```
πŸ“š Complete Code Examples ### Example 1: Custom Company Analysis ```python import asyncio import os from dotenv import load_dotenv load_dotenv() from src.config import Config from src.memory import Memory from src.agents import DataCollector, DataAnalyzer, ReportGenerator async def analyze_company(): config = Config( config_file_path='my_config.yaml', config_dict={ 'target_name': 'Apple Inc.', 'stock_code': 'AAPL', 'target_type': 'financial_company', 'language': 'en', } ) memory = Memory(config=config) # Run analysis analyzer = DataAnalyzer( config=config, memory=memory, use_llm_name=os.getenv("DS_MODEL_NAME"), use_vlm_name=os.getenv("VLM_MODEL_NAME"), use_embedding_name=os.getenv("EMBEDDING_MODEL_NAME") ) await analyzer.async_run( input_data={ 'task': 'Apple Inc. Investment Research', 'analysis_task': 'Analyze iPhone vs Services revenue' }, max_iterations=15, enable_chart=True ) # Generate report generator = ReportGenerator(config=config, memory=memory, use_llm_name=os.getenv("DS_MODEL_NAME"), use_embedding_name=os.getenv("EMBEDDING_MODEL_NAME")) await generator.async_run( input_data={'task': 'Apple Inc. Investment Research'}, enable_chart=True ) asyncio.run(analyze_company()) ``` ### Example 2: General Deep Research ```python async def deep_research(query: str): config = Config(config_dict={ 'target_name': query, 'target_type': 'general', 'language': 'en', }) memory = Memory(config=config) # Auto-generate analysis tasks tasks = await memory.generate_analyze_tasks( query=query, use_llm_name=os.getenv("DS_MODEL_NAME"), max_num=5 ) for task in tasks: analyzer = DataAnalyzer(config=config, memory=memory, ...) await analyzer.async_run( input_data={'task': query, 'analysis_task': task}, enable_chart=False ) generator = ReportGenerator(config=config, memory=memory, ...) await generator.async_run( input_data={'task': query}, enable_chart=False, add_introduction=False ) asyncio.run(deep_research("Impact of AI on healthcare in 2024")) ```
> **πŸ“š See [docs/advanced_usage.md](docs/advanced_usage.md)** for the complete technical reference with architecture diagrams, troubleshooting, and more examples. --- ## πŸ“Š Evaluation Results We conducted comprehensive evaluations comparing FinSight against leading commercial deep research systems, including **OpenAI Deep Research** and **Gemini-2.5-Pro Deep Research**. Our experiments demonstrate that FinSight significantly outperforms existing solutions, achieving a state-of-the-art overall score of **8.09**.
| | | |:---:|:---:| | Comparison with Deep Research Agents | Generated Report Sample | | **Figure 1:** Performance comparison against SOTA agents | **Figure 2:** Key Fact Recall and Citation Authority |
**Key Findings:** - **SOTA Performance:** FinSight achieves superior overall scores (**8.09**) compared to Gemini-2.5-Pro Deep Research (6.82) and OpenAI Deep Research (6.11). - **Deep Analysis:** Our **Two-Stage Writing Framework** produces reports with higher information richness and analytical depth compared to single-pass LLM searches. - **Expert Visualization:** The **Iterative Vision-Enhanced Mechanism** ensures publication-grade charts, achieving a visualization score of **9.00** (vs. 4.65 for OpenAI). > πŸ“„ For detailed experimental setup and complete results, please refer to our [paper](https://arxiv.org/abs/2510.16844). --- ## πŸ“œ License This project is licensed under the Apache License 2.0 - see the [LICENSE](LICENSE) file for details. --- ## πŸ“– Citation If you find FinSight useful in your research, please cite our paper: ```bibtex @article{jin2025finsight, author = {Jiajie Jin and Yuyao Zhang and Yimeng Xu and Hongjin Qian and Yutao Zhu and Zhicheng Dou}, title = {FinSight: Towards Real-World Financial Deep Research}, journal = {CoRR}, volume = {abs/2510.16844}, year = {2025}, url = {https://doi.org/10.48550/arXiv.2510.16844}, eprinttype = {arXiv}, eprint = {2510.16844}, } ``` --- ## πŸ™ Acknowledgments - [AkShare](https://akshare.akfamily.xyz/) for financial data APIs - [eFinance](https://github.com/mpquant/efinance) for stock data - [Crawl4AI](https://github.com/unclecode/crawl4ai) for web crawling