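# althttpd

**Repository Path**: ikaros-521/alhttpd

## Basic Information

- **Project Name**: althttpd
- **Description**: AltHTTPD, a web server written by the author of SQLite. This repository is for learning purposes only.
- **License**: MIT
- **Default Branch**: master
- **Created**: 2021-08-30
- **Last Updated**: 2022-05-24

## README

The Althttpd Webserver
======================

[[For English Readers]](./README.en.md)

Althttpd is a simple web server that has been running since 2004. Althttpd strives for simplicity, security, and low resource usage.

In 2018, that Althttpd instance answered about 500,000 requests per day (about 5 or 6 per second), delivering about 50GB of content per day (about 4.6 Mbit/s) on a $40/month [Linode](https://www.linode.com/pricing) server. The load average on the machine normally stays around 0.1 to 0.2. About 19% of the HTTP requests are CGI to various [Fossil](https://fossil-scm.org/) source-code repositories.

Design Philosophy
-----------------

Althttpd is usually launched from [xinetd](https://en.wikipedia.org/wiki/Xinetd) or [stunnel4](https://www.stunnel.org/). A separate process is started for each incoming connection, and that process serves only that one connection. A single althttpd process handles one or more HTTP requests over the same connection. When the connection closes, the althttpd process exits.

Althttpd can also operate stand-alone. Althttpd itself listens on port 80 for incoming HTTP requests, then forks a copy of itself to handle each inbound connection. Each connection is still handled by a separate process. The only difference is that the connection-handler process is now started by a master althttpd instance rather than by xinetd or stunnel4.

Althttpd has no configuration file. All configuration is passed in through a few command-line arguments. This keeps the configuration simple and reduces the worry of introducing a security vulnerability through a misconfigured web server.

Althttpd does not itself handle TLS connections. For HTTPS, althttpd relies on stunnel4 to handle TLS protocol negotiation, decryption, and encryption.

Because each althttpd process only needs to service a single connection, it is single-threaded. Furthermore, each process lives only for the duration of a single connection, which means that althttpd does not need to worry much about memory leaks. These design factors help keep the althttpd source code simple, which facilitates security auditing and analysis.

Source Code
-----------

The complete source code for althttpd is contained in a single C-code file with no dependencies outside of the standard C library. The source code file is named "[althttpd.c](/file/althttpd.c)". To build and install althttpd, run the following command:

```shell
gcc -Os -o /usr/bin/althttpd althttpd.c
```

The althttpd source code is heavily commented and very readable. It should be relatively easy to customize for specialized needs.

Setup Using Xinetd
------------------

Shown below is the complete text of the /etc/xinetd.d/http file on sqlite.org that configures althttpd to serve unencrypted HTTP requests on both IPv4 and IPv6. You can use this as a template to create your own installations.
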
```text
service http
{
  port = 80
  flags = IPv4
  socket_type = stream
  wait = no
  user = root
  server = /usr/bin/althttpd
  server_args = -logfile /logs/http.log -root /home/www -user www-data
  bind = 45.33.6.223
}
```

```text
service http
{
  port = 80
  flags = REUSE IPv6
  bind = 2600:3c00::f03c:91ff:fe96:b959
  socket_type = stream
  wait = no
  user = root
  server = /usr/bin/althttpd
  server_args = -logfile /logs/http.log -root /home/www -user www-data
}
```

The key observation here is that each incoming TCP/IP connection on port 80 launches a copy of /usr/bin/althttpd with some additional arguments that amount to the configuration for the webserver.

Notice that althttpd is run as the superuser. This is not required, but if it is done, then althttpd will move itself into a chroot jail at the root of the web document hierarchy (/home/www in the example) and then drop all superuser privileges prior to reading any content off of the wire. The -user option tells althttpd to become user www-data after entering the chroot jail.

The -root option tells althttpd where to find the document hierarchy. In the case of sqlite.org, all content is served from /home/www. At the top level of this document hierarchy is a bunch of directories whose names end with ".website". Each such directory is a separate website. The directory is chosen based on the Host: parameter of the incoming HTTP request.

A partial list of the directories on sqlite.org is this:

>     3dcanvas_tcl_lang_org.website
>     3dcanvas_tcl_tk.website
>     androwish_org.website
>     canvas3d_tcl_lang_org.website
>     canvas3d_tcl_tk.website
>     cvstrac_org.website
>     default.website
>     fossil_scm_com.website
>     fossil_scm_hwaci_com.website
>     fossil_scm_org.website
>     system_data_sqlite_org.website
>     wapp_tcl_lang_org.website
>     wapp_tcl_tk.website
>     www2_alt_mail_net.website
>     www_androwish_org.website
>     www_cvstrac_org.website
>     www_fossil_scm_com.website
>     www_fossil_scm_org.website
>     www_sqlite_org.website

For each incoming HTTP request, althttpd takes the text of the Host: parameter in the request header, converts it to lowercase, and changes all characters other than ASCII alphanumerics into "_". The result determines which subdirectory to use for content. If nothing matches, the "default.website" directory is used as a fallback.

For example, if the Host parameter is "www.SQLite.org" then the name is translated into "www\_sqlite\_org.website" and that is the directory used to serve content. If the Host parameter is "fossil-scm.org" then the "fossil\_scm\_org.website" directory is used. Oftentimes, two or more names refer to the same website. For example, fossil-scm.org, www.fossil-scm.org, fossil-scm.com, and www.fossil-scm.com are all the same website. In that case, typically only one of the directories is a real directory and the others are symbolic links.

On a minimal installation that only hosts a single website, it suffices to have a single subdirectory named "default.website".

Within the *.website directory, the file to be served is selected by the HTTP request URI. Files that are marked as executable are run as CGI. Non-executable files with a name that ends with ".scgi" and that have content of the form "SCGI hostname port" relay an SCGI request to hostname:port. All other non-executable files are delivered as-is.

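The Host-to-directory mapping described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this README, not a transcription of althttpd.c; the function names `host_to_dirname` and `pick_site_dir` are hypothetical, and details such as a port suffix in the Host value are not addressed.

```python
from pathlib import Path


def host_to_dirname(host: str) -> str:
    """Map a Host header value to a "*.website" directory name.

    Lowercase the value, then replace every character that is not an
    ASCII letter or digit with "_".
    """
    mapped = "".join(
        ch if (ch.isascii() and ch.isalnum()) else "_"
        for ch in host.lower()
    )
    return mapped + ".website"


def pick_site_dir(root: Path, host: str) -> Path:
    """Return the content directory for a host, falling back to default.website."""
    candidate = root / host_to_dirname(host)
    return candidate if candidate.is_dir() else root / "default.website"
```

With this sketch, `host_to_dirname("www.SQLite.org")` yields `www_sqlite_org.website`, matching the example in the text. Symbolic links to a shared directory work unchanged, because `is_dir()` follows them.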
If the request URI specifies the name of a directory within *.website, then althttpd appends "/home", "/index.html", and "/index.cgi", in that order, looking for a match.

If a prefix of a URI matches the name of an executable file then that file is run as CGI. For as-is content, the request URI must exactly match the name of the file.

For content delivered as-is, the MIME-type is deduced from the filename extension using a table that is compiled into althttpd.

Setup For HTTPS Using Stunnel4
------------------------------

Althttpd itself does not do any encryption. To set up an encrypted website using althttpd, the recommended technique is to use [stunnel4](https://www.stunnel.org/).

On the sqlite.org website, the relevant lines of the /etc/stunnel/stunnel.conf file are:

```text
cert = /etc/letsencrypt/live/sqlite.org/fullchain.pem
key = /etc/letsencrypt/live/sqlite.org/privkey.pem
[https]
accept = :::443
TIMEOUTclose = 0
exec = /usr/bin/althttpd
execargs = /usr/bin/althttpd -logfile /logs/http.log -root /home/www -user www-data -https 1
```

This setup is very similar to the xinetd setup. One key difference is that the "-https 1" option is used to tell althttpd that the connection is encrypted. This is important so that althttpd will know to set the HTTPS environment variable for CGI programs.

It is fine to have both xinetd and stunnel4 configured to run althttpd at the same time. In fact, that is the way that the SQLite.org website works. Unencrypted HTTP requests go through xinetd and HTTPS requests go through stunnel4.

Stand-alone Operation
---------------------

On the author's desktop workstation, the home directory contains a subdirectory named `~/www/default.website`. That subdirectory holds a collection of files and CGI scripts. Althttpd can serve the content there by running the following command:

```shell
althttpd -root ~/www -port 8080
```

The `-port 8080` option tells althttpd to run in stand-alone mode, listening on port 8080.

The author of althttpd has only used stand-alone mode for testing. Since althttpd does not itself support TLS encryption, the stunnel4 setup is preferred for production websites.

Stand-alone Operation (Supplement)
----------------------------------

To start a minimal server, place a single HTML file named `index.html` in a directory and run the following command from within that directory:

```shell
althttpd -root . -port 80
```

Then visit the server's address in a browser to retrieve the contents of `index.html`.

Security Features
-----------------

To defend against mischief, there are restrictions on names of files that althttpd will serve. Within the request URI, all characters other than alphanumerics and ",-./:~" are converted into a single "_". Furthermore, if any path element of the request URI begins with "." or "-" then althttpd always returns a 404 Not Found error. Thus it is safe to put auxiliary files (databases or other content used by CGI, for example) in the document hierarchy as long as the filenames begin with "." or "-".

An exception: Though althttpd normally returns 404 Not Found for any request with a path element beginning with ".", it does allow requests where the URI begins with "/.well-known/". File or directory names below "/.well-known/" are allowed to begin with "." or "-" (but not with ".."). This exception is necessary to allow LetsEncrypt to validate ownership of the website.

Basic Authentication
--------------------

If a file named "-auth" appears anywhere within the content hierarchy, then all sibling files and all files in lower-level directories require [HTTP basic authentication](https://en.wikipedia.org/wiki/Basic_access_authentication), as defined by the content of the "-auth" file. The "-auth" file is plain text and line oriented. Blank lines and lines that begin with "#" are ignored. Other lines have meaning as follows:

* http-redirect

  The http-redirect line, if present, causes all HTTP requests to redirect into an HTTPS request. The "-auth" file is read and processed sequentially, so lines below the "http-redirect" line are never seen or processed for http requests.

* https-only

  The https-only line, if present, means that only HTTPS requests are allowed. Any HTTP request results in a 404 Not Found error. The https-only line normally occurs after an http-redirect line.

* realm NAME

  A single line of this form establishes the "realm" for basic authentication.
  Web browsers will normally display the realm name as a title on the dialog box that asks for username and password.

* user NAME LOGIN:PASSWORD

  There can be multiple user lines, one for each valid user. The LOGIN:PASSWORD argument defines the username and password that the user must type to gain access to the website. The password is clear-text. HTTP Basic Authentication is not the most secure authentication mechanism. Upon successful login, the NAME is stored in the REMOTE_USER environment variable so that it can be accessed by CGI scripts. NAME and LOGIN are usually the same, but can be different.

* anyone

  If the "anyone" line is encountered, it means that any request is allowed, even if there is no username and password provided. This line is useful in combination with "http-redirect" to cause all ordinary HTTP requests to redirect to HTTPS without requiring login credentials.

Basic Authentication Examples
-----------------------------

The sqlite.org website contains a "-auth" file in the toplevel directory as follows:

>     http-redirect
>     anyone

That -auth file causes all HTTP requests to be redirected to HTTPS, without requiring any further login. (Try it: visit http://sqlite.org/ and verify that you are redirected to https://sqlite.org/.)

There is another "-auth" file, in a "private" subdirectory, that looks like this:

>     realm Access To All Fossil Repositories
>     http-redirect
>     user drh drh:xxxxxxxxxxxxxxxx

Except, of course, the password is not a row of "X" characters. This demonstrates the typical use for a -auth file. Access is granted for a single user to the content in the "private" subdirectory, provided that the user enters with HTTPS instead of HTTP. The "http-redirect" line is strongly recommended for all basic authentication since the password is contained within the request header and can be intercepted and stolen if the request is sent via HTTP.

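The sequential processing of a "-auth" file can be sketched as follows. This is a minimal Python illustration based only on the directive descriptions in this README, not the logic of althttpd.c itself. The function name `check_auth`, its return values, and the idea of passing the already-decoded "LOGIN:PASSWORD" string are all assumptions made for the example. The "realm" line is skipped because it only affects the login prompt.

```python
def check_auth(auth_text, is_https, credentials=None):
    """Walk a "-auth" file top to bottom and return (decision, remote_user).

    decision is one of "redirect", "not-found", "allow", "login-required".
    credentials is the decoded "LOGIN:PASSWORD" string from the request,
    or None when the client sent no credentials (hypothetical interface).
    """
    for raw in auth_text.splitlines():
        # Blank lines and lines that begin with "#" are ignored.
        if not raw.strip() or raw.startswith("#"):
            continue
        fields = raw.split()
        keyword = fields[0]
        if keyword == "http-redirect":
            if not is_https:
                return ("redirect", None)  # later lines are never seen for HTTP
        elif keyword == "https-only":
            if not is_https:
                return ("not-found", None)
        elif keyword == "anyone":
            return ("allow", None)
        elif keyword == "user" and len(fields) == 3:
            name, login_password = fields[1], fields[2]
            if credentials is not None and credentials == login_password:
                return ("allow", name)  # NAME would become REMOTE_USER
        # "realm NAME" and unknown lines change nothing in this sketch.
    return ("login-required", None)
```

Applied to the two example files above, an HTTP request is redirected in both cases, while an HTTPS request is allowed immediately by the first file and needs matching credentials for the second.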
Log File
--------

If the -logfile option is given on the althttpd command-line, then a single line is appended to the named file for each HTTP request. The log file is in the Comma-Separated Value or CSV format specified by [RFC4180](https://tools.ietf.org/html/rfc4180). There is a comment in the source code that explains what each of the fields in this output line means.

The fact that the log file is CSV makes it easy to import into SQLite for analysis, using a script like this:

>     CREATE TABLE log(
>       date TEXT,             /* Timestamp */
>       ip TEXT,               /* Source IP address */
>       url TEXT,              /* Request URI */
>       ref TEXT,              /* Referer */
>       code INT,              /* Result code.  ex: 200, 404 */
>       N_IN INT,              /* Bytes in request */
>       N_OUT INT,             /* Bytes in reply */
>       t1 INT, t2 INT,        /* Process time (user, system) milliseconds */
>       t3 INT, t4 INT,        /* CGI script time (user, system) milliseconds */
>       t5 INT,                /* Wall-clock time, milliseconds */
>       nreq INT,              /* Sequence number of this request */
>       agent TEXT,            /* User agent */
>       user TEXT,             /* Remote user */
>       n INT,                 /* Bytes of url that are in SCRIPT_NAME */
>       lineno INT             /* Source code line that generated log entry */
>     );
>     .mode csv
>     .import httplog.csv log

The filename on the -logfile option may contain time-based characters that are expanded by [strftime()](https://linux.die.net/man/3/strftime). Thus, to cause a new logfile to be used for each day, you might use something like:

>     -logfile /var/logs/althttpd/httplog-%Y%m%d.csv
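
To see how such a pattern expands, the short Python sketch below runs the same template through `time.strftime`, which follows the C strftime() conversion rules for `%Y`, `%m`, and `%d`. The helper name `expand_logfile_name` is hypothetical, and whether althttpd expands the pattern in local time or UTC is not stated here, so the timestamp is passed in explicitly.

```python
import time


def expand_logfile_name(pattern: str, when: time.struct_time) -> str:
    """Expand strftime-style fields in a -logfile pattern for a given time."""
    return time.strftime(pattern, when)


pattern = "/var/logs/althttpd/httplog-%Y%m%d.csv"
day_one = expand_logfile_name(pattern, time.gmtime(0))      # 1970-01-01 UTC
day_two = expand_logfile_name(pattern, time.gmtime(86400))  # one day later
print(day_one)
print(day_two)
```

Because the date is part of the filename, requests logged on different days land in different files without any external log-rotation step.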