# althttpd
**Repository Path**: ikaros-521/alhttpd
## Basic Information
- **Project Name**: althttpd
- **Description**: AltHTTPD, a web server written by the author of SQLite.
This repository is for learning purposes only.
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 2
- **Created**: 2021-08-30
- **Last Updated**: 2022-05-24
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
The Althttpd Webserver
======================
[[For English Readers]](./README.en.md)
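Althttpd is a simple webserver that has been in service since 2004. Althttpd strives for simplicity, security, and low resource
usage.
As of 2018, the althttpd instance answers about 500,000 HTTP requests per day (about 5 or 6 per second), delivering about 50GB of content per day (about 4.6 megabits/second),
on a $40/month [Linode](https://www.linode.com/pricing).
The load average on this machine normally stays around 0.1 or 0.2. About 19%
of the HTTP requests are CGI to various [Fossil](https://fossil-scm.org/)
source-code repositories.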
Design Philosophy
----------------
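Althttpd is usually launched from
[xinetd](https://en.wikipedia.org/wiki/Xinetd) or
[stunnel4](https://www.stunnel.org/). A separate process is started for each
incoming connection, and that process is wholly focused on serving that
one connection. A single althttpd process handles one or more HTTP requests
over the same connection. When the connection closes, the althttpd process
exits.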
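Althttpd can also operate stand-alone. Althttpd itself listens on port 80 for incoming HTTP requests,
then forks a copy of itself to handle each inbound connection. Each connection
is still handled using a separate process. The only difference is that
the connection-handler process is now started by a master althttpd instance
rather than by xinetd or stunnel4.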
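The fork-per-connection model described above can be sketched in Python. This is only an illustration of the pattern, not althttpd's C implementation: the names `handle_connection` and `serve` and the fixed response body are hypothetical, and the sketch answers a single request per connection, whereas an althttpd process may answer several.

```python
import os
import signal
import socket


def handle_connection(conn):
    """Serve one connection; the calling child process exits afterwards."""
    conn.recv(65536)  # read the request; a real server would parse it
    body = b"hello\n"
    head = b"HTTP/1.0 200 OK\r\nContent-Length: %d\r\n\r\n" % len(body)
    conn.sendall(head + body)


def serve(port):
    """Accept connections forever, forking one child process per connection."""
    signal.signal(signal.SIGCHLD, signal.SIG_IGN)  # let the kernel reap children
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        listener.bind(("", port))
        listener.listen()
        while True:
            conn, _addr = listener.accept()
            if os.fork() == 0:
                # Child: only this one connection matters from here on.
                listener.close()
                try:
                    handle_connection(conn)
                finally:
                    conn.close()
                    os._exit(0)
            conn.close()  # parent drops its copy and keeps listening
```

Calling `serve(8080)` from a script starts the listener. Because every child exits when its connection closes, any memory it leaked is reclaimed by the operating system.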
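Althttpd has no configuration file. All configuration is handled using a few command-line arguments.
This helps to keep the configuration simple and
mitigates worries about introducing a security vulnerability
through a misconfigured web server.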
Althttpd does not itself handle TLS connections. For HTTPS, althttpd relies on stunnel4 to handle TLS protocol negotiation,
decryption, and encryption.
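Because each althttpd process only needs to service a single connection, althttpd is single threaded. Furthermore, each process
only lives for the duration of a single connection, which means that
althttpd does not need to worry too much about memory leaks.
These design factors help keep the althttpd source code simple,
which facilitates security auditing and analysis.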
Source Code
-----------
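The complete source code for althttpd is contained within a single
C-code file with no dependencies outside of the standard C library.
The source code file is named "[althttpd.c](/file/althttpd.c)".
To build and install althttpd, run the following command: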
```shell
gcc -Os -o /usr/bin/althttpd althttpd.c
```
The althttpd source code is heavily commented and easy to read.
It should be relatively easy to customize for specialized needs.
Setup Using Xinetd
------------------
Shown below is the complete text of the /etc/xinetd.d/http file on
sqlite.org that configures althttpd to serve unencrypted
HTTP requests on both IPv4 and IPv6.
You can use this as a template to create your own installations.
```text
service http
{
    port = 80
    flags = IPv4
    socket_type = stream
    wait = no
    user = root
    server = /usr/bin/althttpd
    server_args = -logfile /logs/http.log -root /home/www -user www-data
    bind = 45.33.6.223
}
```
```text
service http
{
    port = 80
    flags = REUSE IPv6
    bind = 2600:3c00::f03c:91ff:fe96:b959
    socket_type = stream
    wait = no
    user = root
    server = /usr/bin/althttpd
    server_args = -logfile /logs/http.log -root /home/www -user www-data
}
```
The key observation here is that each incoming TCP/IP connection on
port 80 launches a copy of /usr/bin/althttpd with some additional
arguments that amount to the configuration for the webserver.
Notice that althttpd is run as the superuser. This is not required, but if it
is done, then althttpd will move itself into a chroot jail at the root
of the web document hierarchy (/home/www in the example) and then drop
all superuser privileges prior to reading any content off of the wire.
The -user option tells althttpd to become user www-data after entering
the chroot jail.
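The order of operations matters here: the jail must be entered while the process still has superuser rights, and privileges are dropped only afterwards. A generic Unix sketch of that sequence in Python follows. The function name and its parameters are hypothetical, and the text above does not say whether althttpd performs every one of these calls (the `setgroups` step in particular is an assumption about good practice). The numeric `uid` and `gid` are assumed to have been looked up before entering the jail, since the user database may not be visible from inside it.

```python
import os


def enter_jail_and_drop_privileges(root, uid, gid):
    """Confine the process to `root`, then give up superuser rights."""
    os.chroot(root)    # requires superuser; "/" now refers to `root`
    os.chdir("/")      # make sure the working directory is inside the jail
    os.setgroups([])   # drop supplementary groups (assumed step)
    os.setgid(gid)     # change group first, while still privileged
    os.setuid(uid)     # finally become the unprivileged user
```

For the configuration above this would correspond to something like `enter_jail_and_drop_privileges("/home/www", uid, gid)` with the IDs of the www-data account.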
The -root option tells althttpd where to find the document hierarchy.
In the case of sqlite.org, all content is served from /home/www.
At the top level of this document hierarchy is a bunch of directories
whose names end with ".website". Each such directory is a separate
website. The directory is chosen based on the Host: parameter of the
incoming HTTP request. A partial list of the directories on sqlite.org
is this:
>
    3dcanvas_tcl_lang_org.website
    3dcanvas_tcl_tk.website
    androwish_org.website
    canvas3d_tcl_lang_org.website
    canvas3d_tcl_tk.website
    cvstrac_org.website
    default.website
    fossil_scm_com.website
    fossil_scm_hwaci_com.website
    fossil_scm_org.website
    system_data_sqlite_org.website
    wapp_tcl_lang_org.website
    wapp_tcl_tk.website
    www2_alt_mail_net.website
    www_androwish_org.website
    www_cvstrac_org.website
    www_fossil_scm_com.website
    www_fossil_scm_org.website
    www_sqlite_org.website
For each incoming HTTP request, althttpd takes the text of the Host:
parameter in the request header, converts it to lowercase, and changes
all characters other than ASCII alphanumerics into "_". The result
determines which subdirectory to use for content. If nothing matches,
the "default.website" directory is used as a fallback.
For example, if the Host parameter is "www.SQLite.org" then the name is
translated into "www\_sqlite\_org.website" and that is the directory
used to serve content. If the Host parameter is "fossil-scm.org" then
the "fossil\_scm\_org.website" directory is used. Oftentimes, two or
more names refer to the same website. For example, fossil-scm.org,
www.fossil-scm.org, fossil-scm.com, and www.fossil-scm.com are all the
same website. In that case, typically only one of the directories is
a real directory and the others are symbolic links.
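The Host-to-directory mapping described above is easy to mirror in a few lines of Python. This is a sketch of the documented rule only; the function name `site_directory` is hypothetical, and details such as how a port suffix in the Host value is treated are not covered here.

```python
import os


def site_directory(root, host):
    """Map a Host: header value to a *.website directory under `root`."""
    # Lowercase, then turn everything except ASCII alphanumerics into "_".
    name = "".join(
        c if (c.isascii() and c.isalnum()) else "_" for c in host.lower()
    )
    candidate = os.path.join(root, name + ".website")
    if os.path.isdir(candidate):  # isdir() follows symbolic links
        return candidate
    return os.path.join(root, "default.website")  # fallback
```

With this rule, `site_directory("/home/www", "www.SQLite.org")` selects `/home/www/www_sqlite_org.website` when that directory (or a symbolic link to one) exists, and `/home/www/default.website` otherwise.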
On a minimal installation that only hosts a single website, it suffices
to have a single subdirectory named "default.website".
Within the *.website directory, the file to be served is selected by
the HTTP request URI. Files that are marked as executable are run
as CGI. Non-executable files with a name that ends with ".scgi"
and that have content of the form "SCGI hostname port" relay an SCGI
request to hostname:port. All other non-executable files are delivered
as-is.
If the request URI specifies the name of a directory within *.website,
then althttpd appends "/home", "/index.html", and "/index.cgi", in
that order, looking for a match.
If a prefix of a URI matches the name of an executable file then that
file is run as CGI. For as-is content, the request URI must exactly
match the name of the file.
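The selection rules above (prefix match for CGI, exact match for as-is content, and the index names tried for directories) can be sketched as follows. The function `resolve` and its return values are hypothetical, SCGI relay files are left out, and the sketch assumes the path has already passed the filename restrictions described under Security Features.

```python
import os

INDEX_NAMES = ("home", "index.html", "index.cgi")  # tried in this order


def resolve(site_dir, uri_path):
    """Return (kind, path) where kind is "cgi", "static", or None."""
    parts = [p for p in uri_path.split("/") if p]
    path = site_dir
    for i, part in enumerate(parts):
        path = os.path.join(path, part)
        if os.path.isdir(path):
            continue                      # keep descending
        if not os.path.isfile(path):
            return None, None             # nothing by that name
        if os.access(path, os.X_OK):
            return "cgi", path            # a URI prefix may name a CGI program
        if i == len(parts) - 1:
            return "static", path         # as-is content needs an exact match
        return None, None
    # The URI named a directory: look for an index file.
    for name in INDEX_NAMES:
        candidate = os.path.join(path, name)
        if os.path.isfile(candidate):
            kind = "cgi" if os.access(candidate, os.X_OK) else "static"
            return kind, candidate
    return None, None
```

For a CGI match, the part of the URI after the matched prefix is what a CGI program would normally receive as extra path information.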
For content delivered as-is, the MIME-type is deduced from the filename
extension using a table that is compiled into althttpd.
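A compiled-in table of this kind amounts to a lookup keyed by filename extension. The Python sketch below shows the idea with a tiny illustrative subset; the entries, the case-insensitive match, and the fallback type are assumptions for illustration and are not copied from althttpd.c.

```python
import os

# Illustrative subset only; the table built into althttpd is larger.
MIME_TYPES = {
    "css": "text/css",
    "gif": "image/gif",
    "html": "text/html",
    "jpg": "image/jpeg",
    "png": "image/png",
    "txt": "text/plain",
}


def mime_type(filename, default="application/octet-stream"):
    """Deduce a MIME type from the filename extension (assumed fallback)."""
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    return MIME_TYPES.get(ext, default)
```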
Setup For HTTPS Using Stunnel4
------------------------------
Althttpd itself does not do any encryption.
To set up an encrypted website using althttpd, the recommended technique
is to use [stunnel4](https://www.stunnel.org/).
On the sqlite.org website, the relevant lines of the
/etc/stunnel/stunnel.conf file are:
```text
cert = /etc/letsencrypt/live/sqlite.org/fullchain.pem
key = /etc/letsencrypt/live/sqlite.org/privkey.pem
[https]
accept = :::443
TIMEOUTclose = 0
exec = /usr/bin/althttpd
execargs = /usr/bin/althttpd -logfile /logs/http.log -root /home/www -user www-data -https 1
```
This setup is very similar to the xinetd setup. One key difference is
the "-https 1" option is used to tell althttpd that the connection is
encrypted. This is important so that althttpd will know to set the
HTTPS environment variable for CGI programs.
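A CGI program can use that variable to tell encrypted requests from unencrypted ones. The minimal Python CGI sketch below only tests whether HTTPS is set to a non-empty value, because the exact value althttpd assigns is not stated here; the function name `request_scheme` is hypothetical.

```python
import os


def request_scheme(environ=os.environ):
    """Return "https" if the server marked the connection as encrypted."""
    return "https" if environ.get("HTTPS") else "http"


def main():
    # CGI output: header lines, a blank line, then the body.
    print("Content-Type: text/plain")
    print()
    print(f"This request arrived over {request_scheme()}.")


if __name__ == "__main__":
    main()
```

Saved inside a *.website directory and marked executable, a script like this would be run as CGI under both the xinetd and the stunnel4 setups.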
It is fine to have both xinetd and stunnel4 configured to
run althttpd at the same time. In fact, that is the way that the
SQLite.org website works. Unencrypted HTTP requests go through
xinetd and HTTPS requests go through stunnel4.
Stand-alone Operation
---------------------
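On the author's desktop workstation, the home directory contains a subdirectory named `~/www/default.website`.
That subdirectory holds a collection of files and CGI scripts. Althttpd can serve the content there
by running the following command: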
```shell
althttpd -root ~/www -port 8080
```
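The `-port 8080` option tells althttpd to run in stand-alone mode, listening on port 8080.
The author of althttpd has only ever used stand-alone mode for testing.
Since althttpd does not itself support TLS encryption,
launching it from stunnel4 is the preferred setup for a production website.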
Stand-alone Operation (Supplement)
----------------
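If you only want to start a minimal server,
place a single HTML file named `index.html` in a directory
and run the following command from within that directory: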
```shell
althttpd -root . -port 80
```
Then visit the server's address in a browser to retrieve
the contents of `index.html`.
Security Features
-----------------
To defend against mischief, there are restrictions on names of files that
althttpd will serve. Within the request URI, all characters other than
alphanumerics and ",-./:~" are converted into a single "_". Furthermore,
if any path element of the request URI begins with "." or "-" then
althttpd always returns a 404 Not Found error. Thus it is safe to put
auxiliary files (databases or other content used by CGI, for example)
in the document hierarchy as long as the filenames begin with "." or "-".
An exception: Though althttpd normally returns 404 Not Found for any
request with a path element beginning with ".", it does allow requests
where the URI begins with "/.well-known/". And file or directory names
below "/.well-known/" are allowed to begin with "." or "-" (but not
with ".."). This exception is necessary to allow LetsEncrypt to validate
ownership of the website.
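The filename restrictions above can be expressed as two small checks. The Python sketch below follows the rules as stated in this section; the function names are hypothetical, "alphanumerics" is assumed to mean ASCII letters and digits, and the code is not taken from althttpd.c.

```python
import string

SAFE_CHARS = set(string.ascii_letters + string.digits + ",-./:~")


def sanitize_uri(path):
    """Replace every character outside the allowed set with "_"."""
    return "".join(c if c in SAFE_CHARS else "_" for c in path)


def is_forbidden(path):
    """True if a path element's "." or "-" prefix calls for a 404."""
    elements = [e for e in path.split("/") if e]
    if path.startswith("/.well-known/"):
        # Below /.well-known/ only a ".." prefix remains forbidden.
        return any(e.startswith("..") for e in elements[1:])
    return any(e[0] in ".-" for e in elements)
```

Under these rules a database stored as `/data/.app.db` or an access file named `-auth` can never be fetched directly, while `/.well-known/acme-challenge/-token` remains reachable.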
Basic Authentication
--------------------
If a file named "-auth" appears anywhere within the content hierarchy,
then all sibling files and all files in lower-level directories require
[HTTP basic authentication](https://en.wikipedia.org/wiki/Basic_access_authentication),
as defined by the content of the "-auth" file.
The "-auth" file is plain text and line oriented.
Blank lines and lines that begin with "#" are ignored.
Other lines have meaning as follows:
* http-redirect

  The http-redirect line, if present, causes all HTTP requests to
  redirect into an HTTPS request. The "-auth" file is read and
  processed sequentially, so lines below the "http-redirect" line
  are never seen or processed for HTTP requests.

* https-only

  The https-only line, if present, means that only HTTPS requests
  are allowed. Any HTTP request results in a 404 Not Found error.
  The https-only line normally occurs after an http-redirect line.

* realm NAME

  A single line of this form establishes the "realm" for basic
  authentication. Web browsers will normally display the realm name
  as a title on the dialog box that asks for username and password.

* user NAME LOGIN:PASSWORD

  There are multiple user lines, one for each valid user. The
  LOGIN:PASSWORD argument defines the username and password that
  the user must type to gain access to the website. The password
  is clear-text - HTTP Basic Authentication is not the most secure
  authentication mechanism. Upon successful login, the NAME is
  stored in the REMOTE_USER environment variable so that it can be
  accessed by CGI scripts. NAME and LOGIN are usually the same,
  but can be different.

* anyone

  If the "anyone" line is encountered, it means that any request is
  allowed, even if there is no username and password provided.
  This line is useful in combination with "http-redirect" to cause
  all ordinary HTTP requests to redirect to HTTPS without requiring
  login credentials.
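Because the "-auth" file is processed line by line, its semantics can be captured by a short sequential evaluator. The following Python sketch is an interpretation of the rules listed above, not a port of althttpd's code; the function name, the returned decision strings, and the default realm are hypothetical.

```python
import base64


def check_auth(auth_text, is_https, authorization=None):
    """Evaluate "-auth" file text; return (decision, detail)."""
    realm = "unknown realm"  # hypothetical default when no realm line is seen
    for raw in auth_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments are ignored
        words = line.split(None, 1)
        keyword = words[0]
        arg = words[1] if len(words) > 1 else ""
        if keyword == "http-redirect" and not is_https:
            return "redirect", None       # later lines are never processed
        if keyword == "https-only" and not is_https:
            return "not-found", None
        if keyword == "realm":
            realm = arg
        elif keyword == "anyone":
            return "allow", None
        elif keyword == "user":
            fields = arg.split(None, 1)   # NAME LOGIN:PASSWORD
            if len(fields) != 2:
                continue                  # skip malformed lines
            name, login_password = fields
            token = base64.b64encode(login_password.encode()).decode()
            if authorization == "Basic " + token:
                return "allow", name      # NAME would become REMOTE_USER
    return "unauthorized", realm
```

Here `authorization` stands for the raw value of the request's Authorization header, which for Basic authentication is the word "Basic" followed by the base64 encoding of LOGIN:PASSWORD.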
Basic Authentication Examples
-----------------------------
The sqlite.org website contains a "-auth" file in the
toplevel directory as follows:
>
    http-redirect
    anyone
That -auth file causes all HTTP requests to be redirected to HTTPS, without
requiring any further login. (Try it: visit http://sqlite.org/ and
verify that you are redirected to https://sqlite.org/.)
A "-auth" file protecting a "private" subdirectory looks
like this:
>
    realm Access To All Fossil Repositories
    http-redirect
    user drh drh:xxxxxxxxxxxxxxxx
Except, of course, the password is not a row of "X" characters. This
demonstrates the typical use for a -auth file. Access is granted for
a single user to the content in the "private" subdirectory, provided that
the user enters with HTTPS instead of HTTP. The "http-redirect" line
is strongly recommended for all basic authentication since the password
is contained within the request header and can be intercepted and
stolen by bad guys if the request is sent via HTTP.
Log File
--------
If the -logfile option is given on the althttpd command-line, then a single
line is appended to the named file for each HTTP request.
The log file is in the Comma-Separated Value or CSV format specified
by [RFC4180](https://tools.ietf.org/html/rfc4180).
There is a comment in the source code that explains what each of the fields
in this output line means.
The fact that the log file is CSV makes it easy to import into
SQLite for analysis, using a script like this:
>
    CREATE TABLE log(
      date TEXT,       /* Timestamp */
      ip TEXT,         /* Source IP address */
      url TEXT,        /* Request URI */
      ref TEXT,        /* Referer */
      code INT,        /* Result code.  ex: 200, 404 */
      N_IN INT,        /* Bytes in request */
      N_OUT INT,       /* Bytes in reply */
      t1 INT, t2 INT,  /* Process time (user, system) milliseconds */
      t3 INT, t4 INT,  /* CGI script time (user, system) milliseconds */
      t5 INT,          /* Wall-clock time, milliseconds */
      nreq INT,        /* Sequence number of this request */
      agent TEXT,      /* User agent */
      user TEXT,       /* Remote user */
      n INT,           /* Bytes of url that are in SCRIPT_NAME */
      lineno INT       /* Source code line that generated log entry */
    );
    .mode csv
    .import httplog.csv log
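The same import can be done programmatically with Python's standard `csv` and `sqlite3` modules. The sketch below reuses the column list from the schema above; the function name `import_log` is hypothetical, and skipping rows that do not have exactly 17 fields is an assumption made for robustness.

```python
import csv
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS log(
  date TEXT, ip TEXT, url TEXT, ref TEXT, code INT,
  N_IN INT, N_OUT INT, t1 INT, t2 INT, t3 INT, t4 INT, t5 INT,
  nreq INT, agent TEXT, user TEXT, n INT, lineno INT
)
"""
NCOLS = 17  # number of columns in the schema above


def import_log(db, csv_file):
    """Load log rows from an open CSV file object into table `log`."""
    db.execute(SCHEMA)
    placeholders = ", ".join("?" * NCOLS)
    # Assumption: silently skip rows with an unexpected number of fields.
    rows = (r for r in csv.reader(csv_file) if len(r) == NCOLS)
    db.executemany("INSERT INTO log VALUES(%s)" % placeholders, rows)
    db.commit()
```

A typical call would open the log with `open("httplog.csv", newline="")`, pass it to `import_log` together with a `sqlite3.connect(...)` handle, and then run a query such as `SELECT code, count(*) FROM log GROUP BY code`.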
The filename on the -logfile option may contain time-based characters
that are expanded by [strftime()](https://linux.die.net/man/3/strftime).
Thus, to cause a new logfile to be used for each day, you might use
something like:
>
    -logfile /var/logs/althttpd/httplog-%Y%m%d.csv
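To preview which file a given pattern would produce, the same conversions can be expanded with Python's `time.strftime`, which accepts the same `%Y`, `%m`, and `%d` codes. The helper name below is hypothetical, and UTC is used only to make the result reproducible; whether althttpd expands the pattern in UTC or local time is not stated here.

```python
import time


def logfile_name(pattern, when=None):
    """Expand strftime() conversions in a -logfile pattern (UTC assumed)."""
    # when=None means "now"; otherwise seconds since the Unix epoch.
    return time.strftime(pattern, time.gmtime(when))
```

For example, `logfile_name("/var/logs/althttpd/httplog-%Y%m%d.csv", 0)` expands to `/var/logs/althttpd/httplog-19700101.csv`, the name for the first day of the Unix epoch.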