# spider-gsxt

**Repository Path**: tikazyq/spider-gsxt

## Basic Information

- **Project Name**: spider-gsxt
- **Description**: Company lookup crawler
- **Primary Language**: Python
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 2
- **Forks**: 5
- **Created**: 2017-11-27
- **Last Updated**: 2025-02-01

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# spider-gsxt

Company lookup crawler

## Environment Setup

- Python 3.5
- MySQL 5.6
- PhantomJS 2.0.0
- Chrome

Install the Python dependencies:

```
pip3 install -r requirements.txt
```

## Edit the Configuration File

```
vi config.py
```

## Start the Service

```
python3 app.py
```

## API Call Example

```
/crawl?name=xxxx
```

`name` is the name of the company to look up.

## Install Chrome and ChromeDriver

```bash
sudo apt-get install libxss1 libappindicator1 libindicator7 libnss3
wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
sudo dpkg -i google-chrome*.deb
```

Download and install the latest version of chromedriver (2.33 at the time of writing) from http://chromedriver.storage.googleapis.com/index.html

```bash
wget -N http://chromedriver.storage.googleapis.com/2.33/chromedriver_linux64.zip
unzip chromedriver_linux64.zip
chmod +x chromedriver
sudo mv -f chromedriver /usr/local/share/chromedriver
sudo ln -s /usr/local/share/chromedriver /usr/local/bin/chromedriver
sudo ln -s /usr/local/share/chromedriver /usr/bin/chromedriver
```

Install xvfb:

```bash
sudo apt-get -y install xvfb gtk2-engines-pixbuf
sudo apt-get -y install xfonts-cyrillic xfonts-100dpi xfonts-75dpi xfonts-base xfonts-scalable
```

## Database

The data table is named `companies`.

Field mapping:

```
# Registered capital (in units of 10,000)
register_capital

# Legal representative
representative

# Date of establishment
establish_date

# Business scope
business_scope

# Company name
company_name

# Registered address (company address)
company_address

# Unified social credit code
credit_code

# Update time
update_ts
```

## Issues

[NSS >= 3.26 is required](http://www.bkjia.com/Linuxjc/1229050.html)
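
As a minimal sketch of calling the `/crawl` endpoint from the API call example, the snippet below builds the request URL with the company name percent-encoded, which matters when the name contains Chinese characters. The host and port (`localhost:5000`) and the helper name `build_crawl_url` are assumptions for illustration only; this README does not state where `app.py` listens or what the response looks like.

```python
from urllib.parse import urlencode

# Assumption: placeholder address; the README does not say where app.py listens.
BASE_URL = "http://localhost:5000"


def build_crawl_url(company_name, base_url=BASE_URL):
    """Return the /crawl URL with `name` percent-encoded as UTF-8.

    Hypothetical helper, not part of the repository.
    """
    query = urlencode({"name": company_name})
    return "{}/crawl?{}".format(base_url.rstrip("/"), query)


if __name__ == "__main__":
    # Prints the URL to request, e.g. with curl or a browser.
    print(build_crawl_url("xxxx"))
```

The resulting URL can then be fetched with any HTTP client once the service has been started with `python3 app.py`.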