From 91da68805060056d148cb5376c1c4a7a758b5197 Mon Sep 17 00:00:00 2001 From: Wang Xin Date: Tue, 18 Nov 2025 19:33:40 +0800 Subject: [PATCH 1/3] doc: update docs Signed-off-by: Wang Xin --- doc/LICENSE | 427 ++++++++++++++++++++++ doc/README.md | 49 +++ doc/RELEASE-NOTES.md | 6 + doc/figure/UB-remote-memory-access-v0.png | Bin 0 -> 26396 bytes doc/libobmm.md | 251 +++++++++++++ doc/obmm.md | 106 ++++++ doc/obmm_export.md | 214 +++++++++++ doc/obmm_import.md | 175 +++++++++ doc/obmm_preimport.md | 166 +++++++++ doc/obmm_preimport_sysfs.md | 26 ++ doc/obmm_query.md | 59 +++ doc/obmm_set_ownership.md | 75 ++++ doc/obmm_shmdev.md | 130 +++++++ doc/obmm_shmdev_sysfs.md | 131 +++++++ doc/obmm_unexport.md | 52 +++ doc/obmm_unimport.md | 54 +++ doc/obmm_unpreimport.md | 73 ++++ 17 files changed, 1994 insertions(+) create mode 100644 doc/LICENSE create mode 100644 doc/README.md create mode 100644 doc/RELEASE-NOTES.md create mode 100644 doc/figure/UB-remote-memory-access-v0.png create mode 100644 doc/libobmm.md create mode 100644 doc/obmm.md create mode 100644 doc/obmm_export.md create mode 100644 doc/obmm_import.md create mode 100644 doc/obmm_preimport.md create mode 100644 doc/obmm_preimport_sysfs.md create mode 100644 doc/obmm_query.md create mode 100644 doc/obmm_set_ownership.md create mode 100644 doc/obmm_shmdev.md create mode 100644 doc/obmm_shmdev_sysfs.md create mode 100644 doc/obmm_unexport.md create mode 100644 doc/obmm_unimport.md create mode 100644 doc/obmm_unpreimport.md diff --git a/doc/LICENSE b/doc/LICENSE new file mode 100644 index 0000000..33bec29 --- /dev/null +++ b/doc/LICENSE @@ -0,0 +1,427 @@ +Attribution-ShareAlike 4.0 International + +======================================================================= + +Creative Commons Corporation ("Creative Commons") is not a law firm and +does not provide legal services or legal advice. Distribution of +Creative Commons public licenses does not create a lawyer-client or +other relationship. 
Creative Commons makes its licenses and related +information available on an "as-is" basis. Creative Commons gives no +warranties regarding its licenses, any material licensed under their +terms and conditions, or any related information. Creative Commons +disclaims all liability for damages resulting from their use to the +fullest extent possible. + +Using Creative Commons Public Licenses + +Creative Commons public licenses provide a standard set of terms and +conditions that creators and other rights holders may use to share +original works of authorship and other material subject to copyright +and certain other rights specified in the public license below. The +following considerations are for informational purposes only, are not +exhaustive, and do not form part of our licenses. + + Considerations for licensors: Our public licenses are + intended for use by those authorized to give the public + permission to use material in ways otherwise restricted by + copyright and certain other rights. Our licenses are + irrevocable. Licensors should read and understand the terms + and conditions of the license they choose before applying it. + Licensors should also secure all rights necessary before + applying our licenses so that the public can reuse the + material as expected. Licensors should clearly mark any + material not subject to the license. This includes other CC- + licensed material, or material used under an exception or + limitation to copyright. More considerations for licensors: + wiki.creativecommons.org/Considerations_for_licensors + + Considerations for the public: By using one of our public + licenses, a licensor grants the public permission to use the + licensed material under specified terms and conditions. If + the licensor's permission is not necessary for any reason--for + example, because of any applicable exception or limitation to + copyright--then that use is not regulated by the license. 
Our + licenses grant only permissions under copyright and certain + other rights that a licensor has authority to grant. Use of + the licensed material may still be restricted for other + reasons, including because others have copyright or other + rights in the material. A licensor may make special requests, + such as asking that all changes be marked or described. + Although not required by our licenses, you are encouraged to + respect those requests where reasonable. More_considerations + for the public: + wiki.creativecommons.org/Considerations_for_licensees + +======================================================================= + +Creative Commons Attribution-ShareAlike 4.0 International Public +License + +By exercising the Licensed Rights (defined below), You accept and agree +to be bound by the terms and conditions of this Creative Commons +Attribution-ShareAlike 4.0 International Public License ("Public +License"). To the extent this Public License may be interpreted as a +contract, You are granted the Licensed Rights in consideration of Your +acceptance of these terms and conditions, and the Licensor grants You +such rights in consideration of benefits the Licensor receives from +making the Licensed Material available under these terms and +conditions. + + +Section 1 -- Definitions. + + a. Adapted Material means material subject to Copyright and Similar + Rights that is derived from or based upon the Licensed Material + and in which the Licensed Material is translated, altered, + arranged, transformed, or otherwise modified in a manner requiring + permission under the Copyright and Similar Rights held by the + Licensor. For purposes of this Public License, where the Licensed + Material is a musical work, performance, or sound recording, + Adapted Material is always produced where the Licensed Material is + synched in timed relation with a moving image. + + b. 
Adapter's License means the license You apply to Your Copyright + and Similar Rights in Your contributions to Adapted Material in + accordance with the terms and conditions of this Public License. + + c. BY-SA Compatible License means a license listed at + creativecommons.org/compatiblelicenses, approved by Creative + Commons as essentially the equivalent of this Public License. + + d. Copyright and Similar Rights means copyright and/or similar rights + closely related to copyright including, without limitation, + performance, broadcast, sound recording, and Sui Generis Database + Rights, without regard to how the rights are labeled or + categorized. For purposes of this Public License, the rights + specified in Section 2(b)(1)-(2) are not Copyright and Similar + Rights. + + e. Effective Technological Measures means those measures that, in the + absence of proper authority, may not be circumvented under laws + fulfilling obligations under Article 11 of the WIPO Copyright + Treaty adopted on December 20, 1996, and/or similar international + agreements. + + f. Exceptions and Limitations means fair use, fair dealing, and/or + any other exception or limitation to Copyright and Similar Rights + that applies to Your use of the Licensed Material. + + g. License Elements means the license attributes listed in the name + of a Creative Commons Public License. The License Elements of this + Public License are Attribution and ShareAlike. + + h. Licensed Material means the artistic or literary work, database, + or other material to which the Licensor applied this Public + License. + + i. Licensed Rights means the rights granted to You subject to the + terms and conditions of this Public License, which are limited to + all Copyright and Similar Rights that apply to Your use of the + Licensed Material and that the Licensor has authority to license. + + j. Licensor means the individual(s) or entity(ies) granting rights + under this Public License. + + k. 
Share means to provide material to the public by any means or + process that requires permission under the Licensed Rights, such + as reproduction, public display, public performance, distribution, + dissemination, communication, or importation, and to make material + available to the public including in ways that members of the + public may access the material from a place and at a time + individually chosen by them. + + l. Sui Generis Database Rights means rights other than copyright + resulting from Directive 96/9/EC of the European Parliament and of + the Council of 11 March 1996 on the legal protection of databases, + as amended and/or succeeded, as well as other essentially + equivalent rights anywhere in the world. + + m. You means the individual or entity exercising the Licensed Rights + under this Public License. Your has a corresponding meaning. + + +Section 2 -- Scope. + + a. License grant. + + 1. Subject to the terms and conditions of this Public License, + the Licensor hereby grants You a worldwide, royalty-free, + non-sublicensable, non-exclusive, irrevocable license to + exercise the Licensed Rights in the Licensed Material to: + + a. reproduce and Share the Licensed Material, in whole or + in part; and + + b. produce, reproduce, and Share Adapted Material. + + 2. Exceptions and Limitations. For the avoidance of doubt, where + Exceptions and Limitations apply to Your use, this Public + License does not apply, and You do not need to comply with + its terms and conditions. + + 3. Term. The term of this Public License is specified in Section + 6(a). + + 4. Media and formats; technical modifications allowed. The + Licensor authorizes You to exercise the Licensed Rights in + all media and formats whether now known or hereafter created, + and to make technical modifications necessary to do so. 
The + Licensor waives and/or agrees not to assert any right or + authority to forbid You from making technical modifications + necessary to exercise the Licensed Rights, including + technical modifications necessary to circumvent Effective + Technological Measures. For purposes of this Public License, + simply making modifications authorized by this Section 2(a) + (4) never produces Adapted Material. + + 5. Downstream recipients. + + a. Offer from the Licensor -- Licensed Material. Every + recipient of the Licensed Material automatically + receives an offer from the Licensor to exercise the + Licensed Rights under the terms and conditions of this + Public License. + + b. Additional offer from the Licensor -- Adapted Material. + Every recipient of Adapted Material from You + automatically receives an offer from the Licensor to + exercise the Licensed Rights in the Adapted Material + under the conditions of the Adapter's License You apply. + + c. No downstream restrictions. You may not offer or impose + any additional or different terms or conditions on, or + apply any Effective Technological Measures to, the + Licensed Material if doing so restricts exercise of the + Licensed Rights by any recipient of the Licensed + Material. + + 6. No endorsement. Nothing in this Public License constitutes or + may be construed as permission to assert or imply that You + are, or that Your use of the Licensed Material is, connected + with, or sponsored, endorsed, or granted official status by, + the Licensor or others designated to receive attribution as + provided in Section 3(a)(1)(A)(i). + + b. Other rights. + + 1. 
Moral rights, such as the right of integrity, are not + licensed under this Public License, nor are publicity, + privacy, and/or other similar personality rights; however, to + the extent possible, the Licensor waives and/or agrees not to + assert any such rights held by the Licensor to the limited + extent necessary to allow You to exercise the Licensed + Rights, but not otherwise. + + 2. Patent and trademark rights are not licensed under this + Public License. + + 3. To the extent possible, the Licensor waives any right to + collect royalties from You for the exercise of the Licensed + Rights, whether directly or through a collecting society + under any voluntary or waivable statutory or compulsory + licensing scheme. In all other cases the Licensor expressly + reserves any right to collect such royalties. + + +Section 3 -- License Conditions. + +Your exercise of the Licensed Rights is expressly made subject to the +following conditions. + + a. Attribution. + + 1. If You Share the Licensed Material (including in modified + form), You must: + + a. retain the following if it is supplied by the Licensor + with the Licensed Material: + + i. identification of the creator(s) of the Licensed + Material and any others designated to receive + attribution, in any reasonable manner requested by + the Licensor (including by pseudonym if + designated); + + ii. a copyright notice; + + iii. a notice that refers to this Public License; + + iv. a notice that refers to the disclaimer of + warranties; + + v. a URI or hyperlink to the Licensed Material to the + extent reasonably practicable; + + b. indicate if You modified the Licensed Material and + retain an indication of any previous modifications; and + + c. indicate the Licensed Material is licensed under this + Public License, and include the text of, or the URI or + hyperlink to, this Public License. + + 2. 
You may satisfy the conditions in Section 3(a)(1) in any + reasonable manner based on the medium, means, and context in + which You Share the Licensed Material. For example, it may be + reasonable to satisfy the conditions by providing a URI or + hyperlink to a resource that includes the required + information. + + 3. If requested by the Licensor, You must remove any of the + information required by Section 3(a)(1)(A) to the extent + reasonably practicable. + + b. ShareAlike. + + In addition to the conditions in Section 3(a), if You Share + Adapted Material You produce, the following conditions also apply. + + 1. The Adapter's License You apply must be a Creative Commons + license with the same License Elements, this version or + later, or a BY-SA Compatible License. + + 2. You must include the text of, or the URI or hyperlink to, the + Adapter's License You apply. You may satisfy this condition + in any reasonable manner based on the medium, means, and + context in which You Share Adapted Material. + + 3. You may not offer or impose any additional or different terms + or conditions on, or apply any Effective Technological + Measures to, Adapted Material that restrict exercise of the + rights granted under the Adapter's License You apply. + + +Section 4 -- Sui Generis Database Rights. + +Where the Licensed Rights include Sui Generis Database Rights that +apply to Your use of the Licensed Material: + + a. for the avoidance of doubt, Section 2(a)(1) grants You the right + to extract, reuse, reproduce, and Share all or a substantial + portion of the contents of the database; + + b. if You include all or a substantial portion of the database + contents in a database in which You have Sui Generis Database + Rights, then the database in which You have Sui Generis Database + Rights (but not its individual contents) is Adapted Material, + + including for purposes of Section 3(b); and + c. 
You must comply with the conditions in Section 3(a) if You Share + all or a substantial portion of the contents of the database. + +For the avoidance of doubt, this Section 4 supplements and does not +replace Your obligations under this Public License where the Licensed +Rights include other Copyright and Similar Rights. + + +Section 5 -- Disclaimer of Warranties and Limitation of Liability. + + a. UNLESS OTHERWISE SEPARATELY UNDERTAKEN BY THE LICENSOR, TO THE + EXTENT POSSIBLE, THE LICENSOR OFFERS THE LICENSED MATERIAL AS-IS + AND AS-AVAILABLE, AND MAKES NO REPRESENTATIONS OR WARRANTIES OF + ANY KIND CONCERNING THE LICENSED MATERIAL, WHETHER EXPRESS, + IMPLIED, STATUTORY, OR OTHER. THIS INCLUDES, WITHOUT LIMITATION, + WARRANTIES OF TITLE, MERCHANTABILITY, FITNESS FOR A PARTICULAR + PURPOSE, NON-INFRINGEMENT, ABSENCE OF LATENT OR OTHER DEFECTS, + ACCURACY, OR THE PRESENCE OR ABSENCE OF ERRORS, WHETHER OR NOT + KNOWN OR DISCOVERABLE. WHERE DISCLAIMERS OF WARRANTIES ARE NOT + ALLOWED IN FULL OR IN PART, THIS DISCLAIMER MAY NOT APPLY TO YOU. + + b. TO THE EXTENT POSSIBLE, IN NO EVENT WILL THE LICENSOR BE LIABLE + TO YOU ON ANY LEGAL THEORY (INCLUDING, WITHOUT LIMITATION, + NEGLIGENCE) OR OTHERWISE FOR ANY DIRECT, SPECIAL, INDIRECT, + INCIDENTAL, CONSEQUENTIAL, PUNITIVE, EXEMPLARY, OR OTHER LOSSES, + COSTS, EXPENSES, OR DAMAGES ARISING OUT OF THIS PUBLIC LICENSE OR + USE OF THE LICENSED MATERIAL, EVEN IF THE LICENSOR HAS BEEN + ADVISED OF THE POSSIBILITY OF SUCH LOSSES, COSTS, EXPENSES, OR + DAMAGES. WHERE A LIMITATION OF LIABILITY IS NOT ALLOWED IN FULL OR + IN PART, THIS LIMITATION MAY NOT APPLY TO YOU. + + c. The disclaimer of warranties and limitation of liability provided + above shall be interpreted in a manner that, to the extent + possible, most closely approximates an absolute disclaimer and + waiver of all liability. + + +Section 6 -- Term and Termination. + + a. This Public License applies for the term of the Copyright and + Similar Rights licensed here. 
However, if You fail to comply with + this Public License, then Your rights under this Public License + terminate automatically. + + b. Where Your right to use the Licensed Material has terminated under + Section 6(a), it reinstates: + + 1. automatically as of the date the violation is cured, provided + it is cured within 30 days of Your discovery of the + violation; or + + 2. upon express reinstatement by the Licensor. + + For the avoidance of doubt, this Section 6(b) does not affect any + right the Licensor may have to seek remedies for Your violations + of this Public License. + + c. For the avoidance of doubt, the Licensor may also offer the + Licensed Material under separate terms or conditions or stop + distributing the Licensed Material at any time; however, doing so + will not terminate this Public License. + + d. Sections 1, 5, 6, 7, and 8 survive termination of this Public + License. + + +Section 7 -- Other Terms and Conditions. + + a. The Licensor shall not be bound by any additional or different + terms or conditions communicated by You unless expressly agreed. + + b. Any arrangements, understandings, or agreements regarding the + Licensed Material not stated herein are separate from and + independent of the terms and conditions of this Public License. + + +Section 8 -- Interpretation. + + a. For the avoidance of doubt, this Public License does not, and + shall not be interpreted to, reduce, limit, restrict, or impose + conditions on any use of the Licensed Material that could lawfully + be made without permission under this Public License. + + b. To the extent possible, if any provision of this Public License is + deemed unenforceable, it shall be automatically reformed to the + minimum extent necessary to make it enforceable. If the provision + cannot be reformed, it shall be severed from this Public License + without affecting the enforceability of the remaining terms and + conditions. + + c. 
No term or condition of this Public License will be waived and no + failure to comply consented to unless expressly agreed to by the + Licensor. + + d. Nothing in this Public License constitutes or may be interpreted + as a limitation upon, or waiver of, any privileges and immunities + that apply to the Licensor or You, including from the legal + processes of any jurisdiction or authority. + + +======================================================================= + +Creative Commons is not a party to its public +licenses. Notwithstanding, Creative Commons may elect to apply one of +its public licenses to material it publishes and in those instances +will be considered the “Licensor.” The text of the Creative Commons +public licenses is dedicated to the public domain under the CC0 Public +Domain Dedication. Except for the limited purpose of indicating that +material is shared under a Creative Commons public license or as +otherwise permitted by the Creative Commons policies published at +creativecommons.org/policies, Creative Commons does not authorize the +use of the trademark "Creative Commons" or any other trademark or logo +of Creative Commons without its prior written consent including, +without limitation, in connection with any unauthorized modifications +to any of its public licenses or any other arrangements, +understandings, or agreements concerning use of licensed material. For +the avoidance of doubt, this paragraph does not form part of the +public licenses. + +Creative Commons may be contacted at creativecommons.org. \ No newline at end of file diff --git a/doc/README.md b/doc/README.md new file mode 100644 index 0000000..79d9bb9 --- /dev/null +++ b/doc/README.md @@ -0,0 +1,49 @@ +# OBMM Documentation + +This directory holds the OBMM documentation, which is kept in sync with the mainline code. + +After weighing ease of use against consistency with community conventions, the documentation is maintained as Chinese-language Markdown and organized into three parts: + +1. User-space library interface documentation, corresponding to UNIX man pages section 3 +2. Device documentation, corresponding to UNIX man pages section 4 +3. 
sysfs documentation, corresponding to UNIX man pages section 5 + +For miscellaneous documents outside these categories, see the OBMM wiki. + + + +User-space library interface documentation + +| Document | Contents | +| --------------------- | ------------------------------------------------------------ | +| libobmm.md | libobmm interface overview, core data structures, usage model, and granularity | +| obmm_export.md | `obmm_export()` interface description | +| obmm_unexport.md | `obmm_unexport()` interface description | +| obmm_import.md | `obmm_import()` interface description | +| obmm_unimport.md | `obmm_unimport()` interface description | +| obmm_preimport.md | `obmm_preimport()` interface description | +| obmm_unpreimport.md | `obmm_unpreimport()` interface description | +| obmm_set_ownership.md | `obmm_set_ownership()` interface description | +| obmm_query.md | `obmm_query_memid_by_pa()`, `obmm_query_pa_by_memid()` interface description | + + + +Device documentation + +| Document | Contents | +| -------------- | ------------------------------------- | +| obmm.md | obmm.ko parameters and the /dev/obmm device | +| obmm_shmdev.md | Usage of the obmm_shmdev\${mem_id} device | + + + +sysfs documentation + +| Document | Contents | +| ----------------------- | --------------------------------------------------------- | +| obmm_shmdev_sysfs.md | Contents of the /sys/devices/obmm/obmm_shmdev\${mem_id}/ maintenance and debugging directory | +| obmm_preimport_sysfs.md | Format of the /proc/obmm/preimport_info maintenance and debugging file | + + + +The documents are now in initial shape. errno details and programming demos are still being added. diff --git a/doc/RELEASE-NOTES.md b/doc/RELEASE-NOTES.md new file mode 100644 index 0000000..9809781 --- /dev/null +++ b/doc/RELEASE-NOTES.md @@ -0,0 +1,6 @@ +1.0.0 +Initial version +New features: basic memlink balloon functionality +API changes: none +Bug fixes: none +CVE vulnerabilities: none \ No newline at end of file diff --git a/doc/figure/UB-remote-memory-access-v0.png b/doc/figure/UB-remote-memory-access-v0.png new file mode 100644 index 0000000000000000000000000000000000000000..699bcc95460ab64d5033203b149987c490849577 GIT binary patch literal 26396 zcmeFZc{r5&|398-l&z3ZWNB5&(vc>+HY!CWl)Xhlw(QFoL(zs#B{Eq$k}XEbz9&?6 zVzLg&U2Y=e$3k-}{f>b^WgEdws9>Kj&y>?)!c%&)0K#JRZ;c-dTO! 
zbv$A`t5&UAck;xs^Q%^^rmR}UG0L?DKC!x%cLV;%;dWm4$g0eSEhDQ|Nv}G2?C=F| z%dsB*EOQgA)Wp#vH(ymRz8$?{e~K^m7@rU+HRQIqi63rxBc@D+KQO9|F~tKb_6QOCt}X{ z%wJDg`2+3d_RkNv{(Hk;i~b+hP}&B6wI=)y!KrYMx&LI<<;swCd%TRu$ckGFb9A{I z-iA%GgmZq*Jf zSLZ}Ce5m#W?GV&t%9n;%waQZL!iV+@GfoDdLrN`da0=(m)_P8`ksn*7qleDPl!Z;O{Y}b8`ifA@k)T;J4wpaE>5xvg@Mtq?!_Csk zQn}7#%O#w;hJQ`Cgx<}8@Y7P7{)cXMyxaN0twE)~uOhe~6EJ{z)_LDR-F^7%w_+c+ zl&ExZhs@+lQIv*DOAF1+g9~JTC$|FEv5#-ZJ98Y`>|CaYo4@pB7LBK=H>lFy*gs?@ zIwcpFFLvD#wcIo_-Xjik$-60$c$JuO(5XBBy*VlTZGTUZmqL0~J~Q)xsC?n@9}_S6 zu_@H67B!S|>P$^e(cX8QTmr+VVP-a`x~A!+WnHhmW*)Z+;19#<2FsUKc*!a5`F3!1 z`;C_w({L6Dtd>_I}n&9XE;AXitsM z9NoZjGS9cy@pJeTN7mHr>$zS>f>Zk8Ky6VtUZVFqr6<^&I$qy(B0>%xy|R>d^6(qD z>HCN^EvkG6Nj-e-_1nAaJ!5FG?#KM6@A@viJi?f*571XNwK90rY^|0_ckL)-PqJyW z9s<@mJjdy3%SeocN_FW0tqZ9%%xABh0>y^kGOu=TnM7^ZPLRU4S=Qu^;QT}`ym{ov zGLq>gR2?jdCuQbr_;gNNvGJ`v>&2V$S1w!By7^oZb&u#8%`Ue`N$UeS8G4)1!|`+SPyjB9u5a`vI{xRy?J4_r6dIEGi2 zbK>MxJV~v&0f$_@rlTs6%uW_?2G8jWS@1vJ*pj9^r)Tsr1<-_mJI`3^Z zc3{}l#XRAK`+aw%8~Qw7Wi5#6<2SI3-ON|p2cD#rbcWZw%3f_;sB(_Mhrf|+*ZMjs zaagv#u=8fiTA?TQ%Bc#8OLs}LqzKnWpSk|fbmEJ2I!4iVT|n!V?i^;TWga~eHf9lh zv(;qz`ueWvvRJ>|5r&_H{qyp(a1vf#=sYC4t*J`ioP@vidxd`vAyV#^Oy>IB)HRGl zTXiGCagz_M%nLX7<`X{1UE5&B&4JQG<&Ka`=5H`UV(LdO6G|7Wt`M9C4hNR+Lj}$6 z^NrK9K4#q+o8P8CA71Y%yn9Hn&ZOv>f6K>VUm2&I?b8;jp4gA-Gmg_=7YuN$*0wzn zt#$kgnkY0y!J1I|?yV#1mQ>gDeA%uHy!L0XoX>5_xp#QqycZ(P6kRLgp)m5R{WFp; zdA$`9HQ4IR@VH{3(r_zmeh0C6VZHi9`D#6@ohG=P-VZWwg*Qkmwh6j*O$*#?jVil% zv(-_ZOCb8VwqpG8MmO&FhptG9I!bzERCXTl5{e%bV^G3d_j%b!EiF&ETD{FRwWiz4 zWNywS=tk9W&SVD7znU~~Gnt7r*P*{F>l|EV>5k^wr*ECa|KnzB5iccoe4Xdrn7cKABIEu97+OkCx}W{cV_Lz+gZVZ@scaT=n$McZv<4{ab*2*J(npU z{pgQC>$odM)Av^44|BgSxE^p;ag82$Qa&(prF}8-LMODf6%7)&3c*vmaK`A~H7?S@ z<)6~ghu^4!;W1L4SZ+80r_SK*UKXs-B&ziV%TzsxV}rDIf&fR?^bsDrO=#bZ0(E{L z^XyUR5ZR{gdjv|gPi5q9x4?V90>g+42%CPSO@+0k3b$ze%msc$=iR2-iVB6rYjpn+ zty94Xrkb2!|0C2V+~BAv9-b20@{i=fUn{m z?JZF=hbVEm(>eA=0DE?MX`ymvtV@(^=deTje`FE6VZsV;)2fgCD}4Q#o>&`k=+>F+ zdCo 
zmO}!bqu13GEf~ckUg&E4lE8E}F>n9&NEf8L6$ZL%kG7kJv1w^Rc1QQ5X-54=E{{-{zY_PGTnmY%z*{k6gRl6{(!vkswLQ_i z9lNnphLh3!NMpTCi?42!W56c$dT-mkfigmnMl^uwzrZUSB%txR75?4p&E1kyecmU| zFHp|W+dA)NChFcjm343p7|BDSO8ytbz`}7 zb4m?7h95=s2oKM2vNi8|qRAC~z4*Ea7|3}MfjYy?@KV7O#}XVBoySKrMB)g#2D>Dd zQo5qwS4ML^aYB+BCsh5|#Bh?@z_N(ic`KIE_*Z&lF73&(v2)iz(lO@9C^IxAk+w;+ zj?lFbDgPiXcs`$5<*K}p*SVI*P4;dGlIpw-wC>%XLg9M4yXj!GfvVBl0wohQ<%ov{ z@6x@F5Y%D@I`=mBM#H))is_gX#i=`)kcd`gqj6hW%|$Ks3{>}OuntpSpCLSJ-_N>S zk#DUz;y#z(+e|2n_3HJ}b0i*ET^+l%Z>ho|WEFl3*!7n5*NteAeekfF@r;f6GO_P8 z?JhF)+wIk2c(1=zg_YZP+B#`$B44_sbLLjkBlSJe7=z;xQtOLH&N!PE?jIg5IVvJt zR}*_~XSBjiQ&T|>lmrhKMdFJ6eeG0~12-k7>Xo=io0_Za*p_hLOi$as^Gjh5V7vFA z85i$RMh5qxLh0LTszHv=>K(`YaDs3(v{*#@%rspRo1}wI!d% z+vV?-AwlHa*tM5S;Kcca0tv}=Y3UYMHRZo$g4^a&f+*{1WzsRIOc}`heYV2!aqER- z@brhl({nT$a`PQe@L^qXTWdFMI^``CM^O$^eQy^`E^S;|ihTq;-o$ z2SNAk06U^+b0;jKuPDRL`HzF+4xeK%<8KzqI^kU93kXD&%>(@sbNbJ$S#9UZbyu`qaJ@Z4&C^_7GE_B5+7 z`*p`FqJel`@S*zHP|VG5ia#E2QsY3a7Tl!4EcO}5IiXlPb8Gr~Y`R5jNsHZthsrKa z#@wu{F1t}4 zcCK_PlE_E&ZA(vM&o12BLX8p8j9eG73eO2npZ8<#UQ{~wM{D`~jMq=;&7>}z~R>971Ot1!^IwD9+xTmm-~q@(jN z3_g^bkU%#@hj6TtN;tOJJ*d2!^ZOo}zDJyzo3rd$a_?|Pw01K%hvq@${BATluX}HM zW*N1sp|%BmOW5U@GU>|*21dXV5~y3q&byA1;o@?#G2mg!6_x7fCO}+YfeHj}$im}< zv(@n`JX}t0AI~J%G0z}rtThKp-@|g%t%&rqWXZ{l80`vpRz;Qa`$c%xqJ8`88fuqL z?K^N4Ck)Fs3);0GCvyFOByz~|@RGtQZ5P?J7czW0*xF2Dv!JvhAGpO$9tTgM9>9Y| z<(#l+V*!EgFL9DoGf!+-xVBv1g`sytL9AyMR*Bo2qQLh@6*D(%+Af_RsUi0@Zn|T9jo>kScjv6y0*9q-7v)p?GBO`xEKhFxawLuxIgX7Fx<VoH-)C`md>vi#Nc1IVes3AQIG_HacSrBfcM|HOFCNA_-y58Gi{gJqJVU!CDue} z$_r(~1y6dYsxNluKdl}FF##8ykBZcrPthWB($Na0-7Ky*(#ZO1{9Iq;7ZB#o*(UDD zf)c^M`Fp#rwxZPN7J<68X1=SXqLGD@&;L&k`Tu{5cB1pP>xSP3tnAJLy{t^EA26dz z{OYwIAR+(IoT!V`B9#_iJvle07iQ_i04p{YYJ$t8OpCVP1_GjJKlZ#VE$F1d!5Bj? 
zk2Ws9>0FOv5Xf)h<`##vw9Ii!%>ItX0MEHzP0N$AdH*>d*lkDN}c3-`K;u?)EQb(0@koYYLD2T-5yMT)w@t(Mt*T-1lj` zr>GM0(9P>5)i>#@jlJI+hzXvx$BEWu$G5KmYrT$|?eVh8r^GlMk$8V~R%`kt|DF7> zh_2Bg^&u4_Mdy&wwq^Nf+k@QhQLFGsy~Hb%cLeo>^VWI_0Cc=+UgK}p5U*ek3UxiU< zRl!YH6s1~HM`KTK2t$g8V}PXYy1dUMe}Ayo}4q`61bR+3_4Hc z3i6$|1q(9#`(^WH4NMEhtC?efn1Ih|jQz&Lbz0$(o1zs>nQ|t!FifdYz15;4IsV`? zq$;&VIh0fvci;=dS`D;%EIQQEABJ!V?6!a;s^mj;RHSx8h58ZVVCHBdr+r*!sp~*> z(m`4x^^&mrtLy?2#?}ujqiyYr&*_fDzGqTW#}xegd5`v7$6mm)dSB&TYlzzWDmhgb z&Gi)(R8XKTy%oP9sB3sOIv$|XHj|{PM;i4K?X*P7!pxXi+Gz%jv|+CaE%WJFa(nO$ z8nt@k?*0oy&Gh{^ic5MM#1ie+94HHr8?rlh1nV|K@q|0j z&wsa3p!?mvuCKAi>!npQxwoPDM*_!+zI)Il7!cYbSXm zGt@SN?=zwYx$DkpyAZZJq0Xc`qmDOmqId;d(io^me&%;rGYDnC1HPJ9KJ(?EjCB5> zN%lbvWAZ`MXFK5zF{p;SC$i;zkN;V&;Ld?>JU6I*S@EYsUxmL6o~tq?``m_$@4+eG zaHVa}3JUBqo|!&Nen)d~N858Gt43*v{_^dmU}L2}GNQ^iaa3fC+Py5bA(4h*J;PIC zc#&+h$yi&_S@yaQ$_?gCxl7P)r4qCS-SOopxR6uYj1#Pl2O=G>Y)bzRIT)@*S(`7h zG@W~14gVYZ1=Ue>7VXdXc`E0! zI5+uj?G1oMZ&OkmCfV#h{^f-8K7+DWn=P4XCPr+l7W#m9W+{vMy;}K9$!RYL?Cv|I zqrG;fN}$lDyoZ9iW9o#!;2*sRbo9Bkvw15D{To!DD-h%E=Zh~+I*7~mg#5tssGNXwmJa+C<`Sm^ z$?a|Q&03+oA(=3ENN3_4N8Dd%FT;?gh%W`u^+d!%<1OCB}y@ z28U2gP0E8yHp%o|@MItbd^T8jwzxfh6h1@|3{1<&)RG`il^A%RrVPlq3p3Wfyz(}=^p!97FQZ^MH3DnMo zOfxsB)fF`lPnjn*ajgvY-!B!YBmJD}&m}$%yLLRX)tc)E?Ea9ELJ7Fl@Kam_&y80Yo6;aj)?cuijpnCPp>g5bF5y0Shin<6a^ zn?yK2vKtS{t3+Ldq=qTyRCa@^>}XFB=8ajN{1!FOeWTq4N(e2nbN6;?{IGRWV6i(obbBY4JIu)=@ZeTGF&Onz(SS|uD=8K>d*qM^6z zVb&EC`qsvaA9K}pMX`W?Xu}MO!=Bl23k+Yd7F2UfOwJ~|=Pi5-{Zk9jdS+_~gJL)8 zu_+?>0V$^DQE&0ClGeix4`GY%aPPZfX1VEkKNxr)nxmro71$yMe|)U2s?_s^Yc9u89riu=)0tZ8r|aM;$^SmT|L2wi5QEx>@C`i)P@~kgB|N~zzoDV0av88 zl87ZUoi1%gCUQMtCCf|s*7k7v?Dz6+kZ#H;|0Bo;tGWQWuW&9T zf$m}p!(sj3=cq(|HTEJSJ&`K8vR^CLQlJ26G`qNf<~qIc;swPBiMw0XCFLS?>#z1n z&KlXf7Pwj^zrKz9Uhxu~%j;|Z?HF9Vrf79lRAc4z84H5l9Y=op|6+Gpa{u$3e0yG7 z7NUdr8Zh|P@Sy+I;D63@BODM>BmLm5kDx4!Gbx#V_Sf{0XytF7b1JZWBmA}Tgb48D z+TW3rS*%^C8>lS;CsV|x?}?t61xIuQ9MK=ARZk|I=1_C{>vF@6P$TEWo!Zv_+tYszZMP+O+R~3}PDo4O 
z`ThG=_X7stoIhDDUa3Z&Z#cV_Vdhk@)7)n;vcHg)$s1+ic}Z2GUFesWgIn-IAtoAl zQy9(VZ_?MSD+Pz5*&g%^{=YO=UGEg0PQ^&d6Zw{$BlwLh`sWGz4ASHU5h}v z($O*|8zi=!E!AeE6eJD~lDfFK1$LXkU{>-tfr}3z2i#=quvI%@s{m1%7!jjp>SPaR z=I0jkfxDAp@hJwNG`K$|QYPicCgn#n<>;wg#=x=tH1)RAcpY?)UBaTKr~kriW)3b0nY(ru zOzt-8o0}Z4;cprSo9Hh6l5Vv`b%7d;-#)%x^K_AD;S#~qQbpd8eSIcvr+A&MRE)@Z zk%`&C;ToWZ%Qb{VX?K30FrH-zPBm~kz*fY>|yx;!=akJ2EYG&i@nrKxIV0}Dr? z6T0Po=Bhc-x&~kKO44S_X0EGf9K{x;8o#SB#~1C%GM?I?NN1#(boHh=o9(X^`eaz{ zpwU=3!dyfb@1ocej?}_PgjYs#BteKnJXe3y6USrNCvsuog2NZe9ZC%K!-S>#l+S4k znrsS!v5;Xbt+xyJxTIyI(C5B<%SuwcS~L>H@*GGJMy2w4N9dIt$z~d0yB2)d4$jY7 zmA>W?>|tGsHpdyC7oI?g)EUMS7^6`>TFCIh5$}iaITri;|J$nkITE6E9JTv}ZCSBv zy`f1LxTBZX$k+4|ucB3jCrs&24X<)? zd7s&M_M>n}LTS%eOo#ROftOA^FYINhYNfv&yqVhY&54y%+H=VM4)CzO1LuiNHE|^Jr;y~x3!KG)oQw-Y6!it0-bQ5j5Re~2Qs-y;mj%eJjOG1C zWc|$apKlYv&>!qhJPf|n1w#FdP~auM>=$t_OM5QJ{%s_((aEZqp$-LSbH>-UgLrpz zaZQrp1E1+**sgTnf9#HMJ^|&hQ-On_N%YMb%V6aqIG;LaaXaG;!Lrey+~G<DA|Cp*GLDOuN0mLkFhbGQ0`!%Hnpl!LJuNP?kcR8(r^gO>xFB&m4?aJCi~{ zC3oIug%cmHGfPv-B?P28BRZu=g@LpAUu6%NvB7ZT|6#brgnql7MjQ&(73ysTDvbfp zXOZE<;TG9$ZPwfS{I9v{t&Bx(W&NHRzP?*kzp(Ag4FNP4pE(@0CA4*W@4jSP7)jfy zzm0?k^6h$`nObp@x<}3#X$cOx>&VwDk{aj8Gim?W9XHPL*X5?4^~#GI(=8qS(n}E} zQQClm$W*7jCNYF2Vxr?6dF0ZaF-8J)MXq0YKrMCio9gA$xq{3yD-P8VehVnBabO_n z7{_t*%Up2Suus-e^o6KD5A%TR)yikohhwj;s9j-+#$2!D`!wX4JoW8bOh@|AKMa?C z-J9V&C{CW!@ox0%6)I)55#`M~6c#@y+{P65uWO?tDAm5v&Ky~Pz^Tl0>-4NxEfbku zC9>)Rn;S2(mO?eBUryXJ@ZYxh)nMrY0!%#%z>%@DcM$K#^2#0~FOQMemo3(onJ;NB zV_RRJVMw{Jb(lcO%SKBba*+mNJW=fH%cG$mDD{MNHo3&lE1#N{{y}Bvg%E;PX)bg$ z>gSmIkLNmeeGoP=7etg{IUb-`6ABQLHQ2_RddLqr=UKZ2>haXx$?~>$EQ8egL^| zm0S0BLD=+yuhwv<12MyrmpSk&S8dZGI2v@u+bfiQv;A?NT$T3TieTN_d$U7{Y8^1s zVXMG7qI!(PQMjIxUb0{L=rSXoz38&UB!${)Ep)W8Vk(L^57RIe`46o%ukIJ}_RSQ} zrHG$gqPYD{CGfVwKWPxnK<`Qf&_wYN{<xQ>@1z%7)bNTXF_#rF?V*YKNPh@ zP>rrx&FAqb^QgN-LXE`kO|wPUKamSP9PTchg7BU4@aI#drK0R4hfiR44W1|ekcf~1 z46RC^ax11O-&#`Rs`YZB7AtWt;E*&ct$v|_NEFup$phVb!r`Lz6mg6bUxdcY#WpAZ zkBLm&5PdP0dgF?Ig?bcAo|W@EEhJ+-WwJ7P?(+G?k8`qC^QSzX{-{z2aGm=uT9!Y0 
zciz61oAMyeryUZbC{#gdOC0t4AN_L%PpzXA zwCB~T^yX%m&p20eG7hyLqF;>;Ur^)MTAbjvoygcYJb=IJJ!7qOg9NwYuR1Nz6&B(B zUUOWx3Y0&L-0Xeq!6zD5;%a%l&nreaJ53%Jw))gsy1(tRW~pGcL2`}69rmOYdvYQa zHy7I9qNkA2Su#%{R5mWwHC{-r__48$fg7o_MHkd9G&jQ|O6R+MhGO-$E{3a(b4?n4 zw_3=tG83sYG1FI!Z(MBI%Yk|z_{qb}axKFg}3fP6fY?+7I-lUb-AFC0!0P zw%m9n?TQ?h=f_k`^v1e@fCeZF(y-6%X3tR-akPxQ5Iy;VSA})X)4TnPuh9+TSr_uh za=X$z$;&f&p$`;F4}DA=4)Tdt%8mEnr6?8VaZ?gW(HNqzlw-2bM~#46#!~wuNR!ox z5y|lZzIT(0YB05oMlG%GTeWgQ^AB@Q_m315yPn~pl+6>$F2%}CJ zla#gqio;+jb7W3VotTo1K(p|}Ce&||(+kw*DkyEISuZzum1eUZDX*b>^T&ot`lhHZ zjN}E-@}6-r+Q1x3#O(3rmO_p}^TW!NdWS0AE;~urh~#XNZr#9OfU6X{PN|Vk!MFLr z5%mN?`hF$=-RBgg`f6o5tmA6!jyMMSs9Ovw6MgDia!d#GjfgmtwUf=b8bvAcjsXwa z%o0 za;dAWyr!ggAAT>?KTCVuu#7xpuxsa>u6;uDZ1`j&b^U4adWwmqkBx`By5=&KxSYl_ zJb5mj`yzSKam5z3*bFVJSnlgm*IC%y4Z%L@o!DNVA-Ns{+wN2ik^tWv(WP^5Q;cd@J=_G)A^ zM;pfXfx@>55 zsmhw3Ir<>(4jiiTvz%0G$yu52n+@unvG;!dZ1%;FdUBztKHtpv7tODim_4pT?{L}A z9O`J|WwEZ+o7-}}(?G_;LrXLwI!}x6msd?l1kYcxjm;TdKgKNMWb~I@M>94%Xv#-N z^xnaCTXfKhQfm^HDzz3X+mvM{JD*TYURBvp9g4rN`$4s~RFEo?Gzd#&?}~1`Sci^_ zgK)UpNGZCetO`S5t$!82_(P^<4frj;3SP^zC8QqZy>uH9h`sACkEmDx2Hcbk?l=F%0V%Ei3)2)*W}9x+)kj2wP$GHw`yRytU=Z}MqD!d8 z!7zs;Yf0kKhq?s=GNqH_qpnU7YFA*0`@JI;--K8&go7d%(VoNh`L4M27piEwvVZAA3 z`M8qRwhKM8<0f!lpd127Y{z&bH$ZWdiOUmG)S&Qv*et$IQ~`rf{*jL;O5&+iLzSs2R; zHPlzJIkXzn@*KdU=iJHwcA9dI)@2wqFaSY(6b|LFYEILWWzOEEaj(!7=^c* z&#y84D3P+O7HHY|uNOpiXeT%T)a!6=48U{*R>A_9SlyLnH2mZ>}`6T4BqWcs{r z{lZ?I5Ss9xoXg)ylMuNkbyG{dL-v}ZD$SnLK|TohBHi>H)Ek?VfKrxjA_N+C_$B6j zzm1O3t))4s*vFm4|2BNzI_~iA?!SXBG{`44QW>zi5Fj5kU)hV(4&lAu`$L0JzhBv` z-wM7*DqCi{P*ambYNT-L>ddc({ZJ@nDq}p10l5FEm}YFqQo<-)8bcs&S3p`GWaqv< z!1>+c8+oC3HbmUHcz^NIm@_}EJ%bkjN3RcZdEFZ_i>JQ0^mveryzFMBvmOH*9I!Ip z*aAdK1j`Z2eJHX->C`%5>4|SwBkm|Q9NG@at|Yn>m|VyA`yc~NhT$i_5c{PO_Z69= zl`Yd;px_E!E9IXx6n7Ep1ix?+ZnZ*j0>8m2+`cMQh}~7pSr? 
zsrv!zoZbeg0dG;{uDugoWlLTCl~T(tciBwoCLLgk6KyVm&_Fm#iL@uV8rhZyU|9;I z#xW`7NYm7i=4`><7k&aawuAPm`Q1L4JI}Z0G=P5qq!~UAuw_Y>ncW|#NrzT_Kuoh7 zNSVj{Qp{2TMNT!=DdXY;j#f(h_jSLoV3T|&tbsS_PY$3ae*jE&%>nzPV*nk~kDL<+ z{stHcD$*Hp z9kzFcmNDc3ZbQ??WS>rg2|glVI)QPq+?=#D%EdW1L!hrJSwSRGgKhShnAGi)Tz0C4qT)zV+W)JM0pKkhWkp^ge%{MgWaszSXNEgz6^k|Ent-IIgm%27nb$#0BcMh=mT z;`34>#~;T=tU3T1rTLt zjTSWs|H;ED>i1bKKEMesBSzA1wUmzIyjvavH&Sl(*#+}Zi z(YG~-cq@&6y7ABe-&u6~KIv%U^9CVcWsHEIF=}FfoVlqC-0A`mFF7sa(&d1pbD_H?eKN$jD186Ujzq}4~ESOikhR01uET8jcFOa1y88IZj=M~@7;OA=E zUmSmM_ZS_u64^5$#U8-%Aps|Uw%1Rzz?8Q40z`oKSs}-z*8&Zs za^a*D3@Gaz45f<>Z$92%?;OB*wE1DlQ3zc} zPoEK#Zqf$_H)LZ6JNIgL--Vft>3GjOLm5A(5EOYR~%sF0={AQgCrA9`n5*_}RatpsAyX6tqxuZaP^@0a_B%^p%R@CPjND9DRS&%Y}V1v%d5c zcu}(XAKiaQ?WjbZbHU&uNwqXQ08nz@$T#vXgmv zSPEgD4xmBwp!!sPBE5{2zE$$rT=$Ts$ylYgKy)ZL3w|R!AmDv z>CnilCm)icEXADi``6q8=Pdo`d0_41-(;i5b2lJnRT;7LE#gk2KSinEM#1k=`jBb# zzgXz#OVX0cT1(Tdl|lyUZ@;#GmWkgY={vktx7N134Y3Xp5C|*uc8iH_pXdwb4*|WN zaDP<5NYn>2*qy33UrpD5I)FcSdX0X97M^m>O}BO-Jy;nQI)N8lu-D6!u#KGGXP=fK zdS(+VtjGPH*5nW1yFT#PA41;}xFM^G{F^kAvQzN|G) z?@J(dmo%I0$6x0C8(SweW3K3nTp|7Fv9`LSM(@*^a{TH&9(qErVD|;dz(Xj4J`MJ<{ zvYCIX$t-C$jhsw5VkrWndbbjKH^wb=4~5Ld4oad~U~mg(^Yl?d=;Dqj4dq{=SdjSc zHj;Ysrxw5sqU;-0+z3pnE6G3`|ar`ta1`FNI&wydnkioL-M` zGLIY5;D%sk{0Q{FC15x-U^ks(1Pdx>pZ!S(9&JEodA92F)n9VFU*q(rEh1b35{Od5 zAJuH9NdDzP_$-lm8EXI3w}Dc4|Js|ij5mzIC@V>XC~!g!BtUY5DqJPYn zd!#D*sL5Ba^(iqs9wh{le1ndUWa&}6wo#47IUy35*SCML1HDHSeHe(_oPb?N8GAGg<6Ne9#`lWy0_7wt3m>hWmH z`@@-2C0to47c#r^%m#IDhc+FwS!QWG;WckO8Hdh&}a!|=GzGDK1i2j5cs|S zi!3x2_|!Y7+Fw#aivXFf(9jvf^=5BheSpy)Z$!_M-_#~E(62=%Oygtr)^fICV&5&X zh-_bvC?5RXR>TR2__b~ z&_LQCZsXW=Hg;|zRBKDXXidOA0w0?T`wUV30BW z6dI!~Y1zX`Y;&f?%`nZ$dj_|?4fDuLJ>+%HeqCK3$`+&8zrDcDbUYZlog?^{kY6za z49G5N%3L3daF#+hWczY@jrS1;+q{;Utd^+Kjhd|Hh!A*Z!fnK-$1fDb^CDI+1-fty zJJM`5J`M~at}o+R`60YdNeA?X9=J&^)wtPPtOBH3x+!y*9==6o1CU!H<7Mn+8;hce z54@F-Ws1&R-?T6pzub!B0d@6Fjg!><=Aa{LV_x?Ve=y2E6&q)QH9Lxk?c9W;Ne(@^ z{;u@nxr1)ldk-BKE7nT!LD!+}U9XF5K+FY}=e5{FNGNyn?x?lQool$b9WMz3lWfX( 
zT)qRp(UWo|C2XsvX3XKd^R#u0Jr{n0q8hYENNw!J98^u=m+_b=#$Ue*8|Ioit=o z4i&y(7E`-yq;ifU0zigu1_-))7az6d`p?81G>bOD7LC~yW>n2CdG~m$tBpg6#1`xY z!Ts&{`RDQ6&i(`9m9tp#Y*Lr2#&$(pu(5{n7?~y=YfCV)p6BUD-T4|4b?ayXTgo!8 z0mWIXFJ3nCnsCJ5iSNT?vwB#m69b_T_0uCb0$ScNRjJyNL^*Md`%Pxlz@SM$63+-EqLw#9^q^NKcQ76I$=*z}hC9FSHJb!+tzRjA{ZTSpSBD^k5;Ih&U&SjUS!w^;-Z+09xLeoF43#_78C#+&JrQ)Ko6 zjwrteN#jHjZad_Ofu_m|qcy%-2)F!MYhFg;%}F722>4-sB1Pv#R(Y`}g&HVP7V@g} z?Imo+Y`%DM8|evY>%tq*Irbl*H*X)PUnS*C&FxgCIIN>E3aiZ(4d(8=bU^l@2MWJr5tZywKU_1!Zz)rZ+n_Iup>`(id%&>HvCw&s7xE@L>r_w2s zqy@;6iG_&^!{-*STM@j1>9wA$xs~d?sIA%iW4e*O*q9#AUMgdIF&X8n`7dZ$L5#2DwVL}d z!M|Lg#RcVu*2Sg8CjO_~bh1k)9VVA4&daY^^JVN=c1UuZul>jn z$our~Y8j|c9$SGF^U9XSsO*8?M_EH}Yb4WA(|M)L464i;`Wypdfzn5!!~-od^csCY zAV2C5bmHjTCEwmqW_e)w1){s7FZYz0Ev*#xwl7?DyrCmWbh($j;MYdh^LuEyY#cfP z&O z)>j#PI`PIWR5zHErZX@u(PniaWjoG~}^ zdJ_Lqc#WG26nG~yCfZEqtC!Pzv0aaz!d=b}4i_s?yHrNLuFLe?>&KwKZB~6n zvRbXiyua^6!diICgr9caDby2mp?`J+$-?5+=bioSX3iPL7#8KgxZfvYAvH~?&=^Oa zJ7tt!M)!w)F+GxZefAfIrB%-|$c~R}F*J#gr*#>{zdstNJ&Q^$ZXeiFIw_cEQ{Rdm zQ2sL6yx1dJHdnRMSUy`%yG0<;s%L}eqOTWt+tN3M#L#&idOT}1fk@T({NnM8Q`n;T zf|0yK;}qvf2cv-z>ZG8-)rbgJwx_Beiz(sD&&hb%xN5-QR6;FgA-__%Y}d|G6)dz6 zJl`8#L*L{aU^|~LQQ&Ir(9>ZmyS%NPG)<${7uT{f+Yl5zx=fpg$YfQLx}1tKcxr&q zmvH}{tDs3-Zt^@3O?+=2RpIZxk3sDF;&9?+DOP+sXW2FR9SM7HL6I4%lek2nNFTMj z`py#dvtQ`MiG*~_0#wL}?C_(KULXD89@A?sYO#^maJUZb3y`sAd^*7}9HTUt6$?2o z9`hrYVj~xGNR~zJPfON!Lr!hN%%hwmg~CCn;Aa9g<_hUTEFu3!`!&~4EbYH!;;0>(B%I>EnoM^hUC%NOa4Tj+}FebK(5jwbPPQ$9hr z`KpQz+c+L|DRlIDibRi!gXFzZBN@&(6RFZ=V+lqYbJfFR)HAjMK77v zm-s{*F~*&k0JTO^=l-$p%|_Uw;Rgp{wd1wPM5eDemh#+I9y_4|!^hOKz9wr5fB$`E z?k;_&sFlAfljEjRG%*)eUVUbJDSwY^jf70GR6X(j6SMK6QtU&ONDqA;3e#_9h4-%{ zM9$cp-~1#1-sc6cUqccGi;@n5LX}h9zo#9g>?1`-XnIY5opRN+|8B>eRISM?8FRdB z{!K#TNRRrFTS(o9xS2K!=&bgPiM~b!q1<2AtmR%SCjSFI3%UUqd~9-6CwS zAwIKrRM%k0v{sP8;;I7d>sjo1NnF`^MPkHxQgRw8=^IF&yn&7|k?^n!&_2t?PFLQx z3!XnkTZpupkEF6bT>^|H!vC%(g`!|oxCaOFd^@5CC8ey>Tm~P$LneR&G@b(V(962k8 zsA|Wu>R2Al@yyZvE;al+2ws*%c8_`_HfNBk_b4X0weg*W#YpG(9Mue_50>PWu{5KO 
zLbm|q(3%<1yY4h{UQTY!^C_7d)^Es9+{0YgtrKy!&KEQ%PYOb%0=!3@CW(GMv}f2H z{U}m&jopFqbi)`@ems)%jhoZO82FMoz9{Wbl!aq0sWZt{>QsBzSJvy8#H|JMv%sDt zwaqAMem_J0o?BQdbz;#1`w^6CRJ96>bG0(7_v@mUo47wBjaleS_xAE8MH4*^dP{Pk z3V_BbxNu5q?bfEYhVK0BJ&9iG5efC>J&UJa?DAwZ4A+w}T?7)m-($(FmURVZ!qRVz z3@Rvk$Nzn&?ISd0TN7mK&l)nOFFW)&F~pTX&$hlg5!1azV?}xq9TC8*Ywpa@u<5kc zER>^DW%rS&gdt{oF*W_QM- zJ#Xo*I$-0+xnE6~Mo2YT{&LmpK>23=wQ1;G94okEEOA>UbzcF%r;y?$ zn+R!$WkHeo?g8=NV4{^ic?E!76?|wr6?G8w*~q=yyFu$30rqaM5TzEwXxJb|Q3 zA=H-6RVieKw@2iv_}<2@YOMr~$0vA4sN?+mitdAIPd?CgAu1FuO6)Cr=%qM_Y8|hu z7ptqWFbIGG#5no)8eOFwBwbr-le+Kw>vABfW>V`t2UKfft1ZbNow^%n`LwTH7tN67 zNX7f=n3-h*!E3pn&IZOs^Xp%H8p=HAwjw&vENnx05(QetD-q*GYa5Rxe*s zR7Y)w+sftDH{^dR)DHyM2F)lCGK045Co2{@z1XZqh=}tnZISpku&#Z%zHA=TFG2UN z;U?Y1MCgOASz{(?-owr#Yq?o!sX2+(LL0&adq`vsn7pg1IpuXzYH4qd_ZTZ*!c=MH z{drs~Pt$*C$^z;>4d#{>*{pzCEfDQ#>$LN}Dw z>fsTZY)_;{2qm(#rnqCaum?&Bv8CZN{Z#su2wViMn9-mXB`1p08dqyyt7ydcJG8WO zcHV@OCnWzW6NvY8h~u(lkwTktwE+v&OuCX#`s1N?a>7MIFQ4 zxVaih(an#X+v{_Fojo3VJhs0#|G3B3_w(`je7>I_@9+EldOt5aBFO4N|+}S zcja!VGvK5z>Xc{9|7Xe)(oc1-Y_#`hK^&6j7nw3(t#fOr{wg6br9x=jVKVaN7WH8D;*%x-(|7HKtSOK)9KggYRD;QpbBcd2; zRA6O);uXR7<(=`ax6aY%?}6xAcBFa#dE+uyQ;>?wVSMd?86id8E-Z;#=wiVb|e6{LsdeE`j4#>s)9lLt|p0z(g9(Wcbo#sw=G}D zD=PACo;O$5eEvtif(oPU-XckESV&WepD6R5xXgMT8|K$7*m5l0h)}r!8;`D=t~JJ2i3p2M#sZVA|u@WVXZJ~*RI=3*Q&~&7FhHYmebf5Mr3bdARBpZ zaX>ON=ct=$6;w)D^$Ydy8}uNqdNCv;GT1DS-*dY`&~5M#(t(y8$($aZ6QYN&IIDF} zk=RP!>vrpr7Vr{jCx*Tc1zVgQfo<+E zTl}h2=4H2;Zj>4F8WEq%)$NU=`nOwFuPg5BcPmnuxyJ7>^&-UiBw)H;07>EfgTC4wn z%(+ZVH|_P#^g2n3t%W|xZ24Ie`4|PMr0}CTQFVSd^XX!F7;H0s# zrl)iM$Qo*O2<`hei@r|)#_`3XYe!?)(n6MmR#&58Mu&K!tk&X&XD+y9SkDu>4U9r~ z-;9JnQ%(vq)RwLLm2K`CU%~GcD;fP50!%JVr^2OkAio&UJ{bR21T6gESRv`4je(>4J+?ALbjTYxB@+ElC=%keB6 zjhL6F?KN?!c;orP#s3>p+vNTKu3DD5*hYDsy+V1?=`8ss1~)n{-Tli<@%X}@{1>VP zTAFn&hg>FsBV#@A$Kf70J9ESI%BFPZQ{`bEOi-;AyQzSjQRw_v);yT9yyS#&dHqB+ xOl(SH*fque>%u0{jr59DyoNLAZyNkkrc~^7ovXHKlT-;mc3bQ}$=S?^_!ENfQ3U`1 literal 0 HcmV?d00001 diff --git a/doc/libobmm.md b/doc/libobmm.md new file mode 
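100644
index 0000000..2308488
--- /dev/null
+++ b/doc/libobmm.md
@@ -0,0 +1,251 @@
+# libobmm: OBMM user-space library
+
+OBMM is the basic component that manages remote memory within a single machine. It can export local memory and import memory exported by other systems. Once the OBMM components on the export and import sides have configured the data path in turn, applications on the import side can access remote memory with `load` and `store`, just as they access local memory.
+
+The OBMM component consists of the user-space library libobmm.so and the kernel module obmm.ko (see obmm(4)). This document gives an overview of what libobmm provides and introduces its key data structures. Each libobmm function is described in detail in its own document.
+
+## OBMM API overview
+
+### Exporting memory
+
+| API | Function |
+| -------------------- | -------------------------------------------- |
+| obmm_export | Export memory from the OBMM memory pool for remote import |
+| obmm_export_useraddr | Export a range of memory mapped by a process for remote import |
+| obmm_unexport | Cancel the export of OBMM memory |
+
+### Importing memory
+
+| API | Function |
+| ------------- | ------------------ |
+| obmm_import | Import memory from a remote system |
+| obmm_unimport | Cancel the import of remote memory |
+
+### Pre-onlining memory
+
+| API | Function |
+| ---------------- | ---------------------------------------- |
+| obmm_preimport | Pre-online a range of remote memory to speed up the later real import |
+| obmm_unpreimport | Cancel the pre-onlining of a memory range |
+
+### Maintaining read/write state
+
+| API | Function |
+| ------------------ | -------------------------------------- |
+| obmm_set_ownership | Set the read/write state of a memory device, used to maintain coherence |
+
+### Querying memory addresses
+
+| API | Function |
+| ---------------------- | ------------------------------------------------ |
+| obmm_query_memid_by_pa | Look up the memory device ID for a physical address (for maintenance and debugging) |
+| obmm_query_pa_by_memid | Look up the physical address for a memory device ID and offset (for maintenance and debugging) |
+
+## OBMM memory ID
+
+OBMM identifies each OBMM memory segment with a 64-bit integer. `OBMM_INVALID_MEMID` (0) is a reserved ID used to indicate an error. Exported and imported memory share one ID space.
+
+Each OBMM memory segment has a character device whose name ends with its ID (obmm_shmdev(4)) and a sysfs directory (obmm_shmdev_sysfs(5)).
+
+## OBMM memory descriptor
+
+To describe a segment of remote memory that can be accessed across hosts, libobmm defines the generic data structure `struct obmm_mem_desc`:
+
+```c
+struct obmm_mem_desc {
+    uint64_t addr;
+    uint64_t length;
+    /* 128bit eid, ordered by little-endian */
+    uint8_t seid[16];
+    uint8_t deid[16];
+    uint32_t tokenid;
+    uint32_t scna;
+    uint32_t dcna;
+    uint16_t priv_len;
+    uint8_t priv[];
+};
+```
+
+This structure is a generic description of a memory segment, within the network scope, that supports cross-host access.
+
+On the provider side, the structure mainly serves as an output parameter that returns the address parameters of the exported memory; a few fields also serve as configuration inputs.
+
+On the consumer side, the structure is an input parameter used to import remote memory.
+
+**addr**
+
+* Memory provider: output. The export flow outputs {tokenid, UBA}; addr stores the UBA, a core element of UB memory link packets.
+* Memory consumer: input. The physical address.
+
+**length**
+
+* Memory provider: output. Returns the total size of the memory actually exported.
+* Memory consumer: input. The size of the memory to import.
+
+**tokenid**
+
+* Memory provider: output. The export flow outputs {tokenid, UBA}, a core element of UB memory link packets.
+* Memory consumer: input. The value is ignored.
+
+**seid**
+
+* Memory provider: ignored.
+* Memory consumer: input. The IODie this node uses to access the target memory.
+
+**deid**
+
+* Memory provider: input. The IODie used when lending the memory.
+* Memory consumer: input. Recorded only.
+
+**scna**
+
+* Memory provider: ignored.
+* Memory consumer: input. The IODie this node uses to access the target memory.
+
+**dcna**
+
+* Memory provider: ignored.
+* Memory consumer: input. Recorded only.
+
+**priv_len and priv**
+
+Opaque private data attached to the memory. The user can pass it in when creating the memory and read it back from OBMM sysfs; see obmm_shmdev_sysfs(5) for how to read it.
+
+On the provider side, the private data is passed through to the UMMU driver.
+
+priv_len: the length of the user's private data.
+
+priv: the user's private data, i.e. the `priv_len` bytes of contiguous memory that immediately follow `struct obmm_mem_desc`.
+
+These two fields are always inputs to OBMM. The user can create a descriptor able to carry priv_len bytes with, for example, `malloc(sizeof(struct obmm_mem_desc) + priv_len)`, and then write byte `i` of the private data to `desc->priv[i]`. If private data is not used, `priv_len` must be set to 0 to avoid unexpected validation failures and out-of-bounds accesses. The upper limit of `priv_len` is `OBMM_MAX_PRIV_LEN` (currently 512; because the interface may change later, applications are advised to use no more than 128 bytes themselves to stay compatible).
+
+## OBMM pre-online memory descriptor
+
+To describe a segment of pre-onlined memory on the consumer side, libobmm defines the generic data structure `struct obmm_preimport_info`:
+
+```c
+struct obmm_preimport_info {
+    uint64_t pa;
+    uint64_t length;
+    int base_dist;
+    int numa_id;
+    uint8_t seid[16];
+    uint8_t deid[16];
+    uint32_t scna;
+    uint32_t dcna;
+    uint16_t priv_len;
+    uint8_t priv[];
+};
+```
+
+This structure is used only on the consumer side. Compared with `struct obmm_mem_desc`, `struct obmm_preimport_info` is a partial description of remote memory: it usually fixes only the data link that the memory to be onlined must use, without pinning down the exact address range.
+
+Based on this partial description, OBMM can create a NUMA node in advance. At real online time, the memory actually borrowed only needs to be "injected" into that NUMA node, which speeds up the software flow on the critical path.
+
+This section only explains the meaning of the fields. How to configure them and what each parameter does depend strongly on the scenario; see obmm_preimport(3) and obmm_unpreimport(3).
+
+**pa**
+
+Input. The starting physical address of the pre-onlined memory.
+
+**length**
+
+Input. The total length of the pre-onlined memory.
+
+**scna**
+
+Input. The IODie this node uses to access the target memory.
+
+**dcna**
+
+Input. The provider-side IODie that this node's accesses to the target memory pass through.
+
+**seid**
+
+Input. The IODie this node uses to access the target memory.
+
+**deid**
+
+Input. The IODie used when lending the memory.
+
+**base_dist**
+
+Input. The base distance from the newly onlined NUMA node to the consumer IODie. From this distance and predefined rules, OBMM computes the distance from the pre-onlined NUMA node to every NUMA node.
+
+**numa_id**
+
+Input and output. The NUMA ID used by the pre-onlined NUMA node. -1 means the system assigns one; otherwise the given value is used as the pre-onlined NUMA ID. When the system assigns the ID, this field is also an output that returns the assigned NUMA ID.
+
+**priv_len and priv**
+
+Input. Together they describe the priv data.
+
+## Usage models: borrowing and sharing
+
+User-space applications can access OBMM memory in two ways, which we call borrowing and sharing:
+
+| Model | Characteristics | Mapping method |
+| ---- | ------------------------------------------------------------ | -------------------------------------- |
+| Borrowing | Each memory segment supports exclusive access by a single consumer only<br>Must be cacheable<br>Remote memory is managed through a remote NUMA node | madvise, mbind, move_pages, numactl, etc. |
+| Sharing | In principle, each memory segment can be used by several parties in turn<br>Memory is mapped by mmap-ing the character device | mmap |
+
+An application selects its usage model through the export and import flags.
+
+When an application uses the borrowing model or the noncacheable attribute, the user does not need to care about the coherence model: all users have read/write permission, and it cannot be adjusted with obmm_set_ownership(3).
+
+When an application uses cacheable memory through the sharing model, the coherence model must be taken into account.
+
+Under mainstream configurations, the OBMM basic granularity of the coherence model is 2M; see the "Operation granularity" section.
+
+For each piece of memory of OBMM basic granularity, a user holds one of three permissions: none (`PROT_NONE`), read (`PROT_READ`), or read/write (`PROT_WRITE`).
+
+* After memory is exported or imported, the permission is none.
+* When mapping, the application can set the permission through the `prot` argument of mmap(2).
+* After mapping, the application can switch the current permission with obmm_set_ownership(3).
+* Unmapping with munmap(2) does not change the permission.
+* obmm_unimport(3) first switches to the none permission automatically and then exits.
+
+At any given moment, the hosts accessing the memory (provider and consumers alike) may only be in one of the following two states; otherwise there is a risk of data inconsistency.
+
+1. All processes accessing the memory hold read or none permission (no accessor holds write permission).
+2. Processes with write permission exist on exactly one host, and all mapping processes on the other hosts hold none permission.
+
+
+## Operation granularity
+
+The granularity of OBMM operations is constrained by several parties:
+
+| Location | Component | Applicable scenario | Granularity | Typical value | Other values |
+| ------ | ------------------------------- | -------------------- | ----------------------- | --------------------- | ---------------------------------- |
+| Provider | Kernel allocator<br>Kernel linear-mapping page table | All | PMD_SIZE | 2M (4K page) | 32M (16K page)<br>512M (64K page) |
+| Provider | UMMU on-chip translation table | All | 2MB | 2M | 4M, 8M, ..., 256M |
+| Provider | Memory allocator granularity | All | Depends on the memory allocator chosen by the user | 2M (4K page) | 1G (4K page) |
+| Consumer | Memory hotplug | Borrowing | memory_block_size_bytes | 128M (4K or 16K page) | 512M (64K page) |
+| Both | Process page table | Sharing | PAGE_SIZE | 4K | 16K, 64K |
+| Both | Cache home agent | All | 128B | 128B | 128B |
+
+The **OBMM basic granularity** is defined as the smallest granularity at which provider and consumer can pass data coherently. It is the least common multiple of the following three values:
+
+* the provider's kernel linear-mapping page size
+* the process page size
+* the cache update granularity
+
+Because the latter two are always smaller than the provider's kernel linear-mapping page size, this granularity equals PMD_SIZE.
+
+All OBMM interfaces (obmm_export, obmm_import, obmm_preimport) are constrained by the OBMM basic granularity.
+
+The export interface (obmm_export) and the import interfaces (obmm_import, obmm_preimport) are constrained by the largest applicable granularity in the table above.
+
+Unless a function's API documentation says otherwise, these granularities apply both to lengths and to the alignment of the related addresses or offsets.
+
+In addition, the core kernel build options must be the same on the provider and consumer that share memory through OBMM; otherwise OBMM's granularity checks may reject valid requests or let invalid ones through.
+
+## Control-plane sequence
+
+Users must configure OBMM in the following order. Violating it may cause process crashes, data inconsistency, chip errors and other unexpected behavior.
+
+1. Provider: export
+2. Consumer: import
+3. Data access
+4. Consumer: unimport
+5. Provider: unexport
diff --git a/doc/obmm.md b/doc/obmm.md
new file mode 100644
index 0000000..9e7d1c2
--- /dev/null
+++ b/doc/obmm.md
@@ -0,0 +1,106 @@
+# obmm: ownership-based memory management component
+
+OBMM is the basic component that manages remote memory within a single machine. It can export local memory and import memory exported by other systems. Once the OBMM components on the export and import sides have configured the data path in turn, applications on the import side can access remote memory with `load` and `store`, just as they access local memory.
+
+The OBMM component consists of the user-space library libobmm.so (see libobmm(3)) and the kernel module obmm.ko. This document describes the functions and parameters of the obmm kernel module.
+
+OBMM provides two kinds of functionality:
+
+* Chip enablement: configures the chip data path so that `load` and `store` instructions can physically execute across hosts.
+* Software enablement: creates easy-to-use software interfaces for remote memory, mainly remote NUMA and the OBMM character device (obmm_shmdev(4)).
+
+After the OBMM kernel module is inserted, it creates a misc device, /dev/obmm. The device interacts with user space (mainly libobmm.so) through ioctl(2) and serves user requests that add or remove memory devices, such as export and import.
+
+## OBMM data path
+
+The OBMM kernel module can initiate several hardware configurations related to UB memory access:
+
+* MMU page tables
+* UB memory decoder translation tables
+* UMMU translation tables
+
+Once every component is configured correctly, the data path is open. The memory access flow is shown in the figure below (responses are not shown).
+
+![UB-remote-memory-access-v0](figure/UB-remote-memory-access-v0.png)
+
+### Decoder configuration
+
+The UB memory decoder is configured by highly trusted hardware. Starting from the PA provided by that hardware, the consumer-side OBMM configures the other components on the path.
+
+## OBMM memory allocation
+
+### Memory sources
+
+UB memory places requirements on the provider's memory, such as contiguity and cache attributes, so the memory is generally allocated by OBMM itself. OBMM supports the following memory sources:
+
+- hugetlb_pmd: allocates memory as PMD-mapped hugetlb pages. The hugetlb pages configured on each local NUMA node form the allocation range (see `/sys/devices/system/node/node${numa_id}/hugepages/hugepages-2048kB/`).
+- hugetlb_pud: allocates memory as PUD-mapped hugetlb pages. The hugetlb pages configured on each local NUMA node form the allocation range (see `/sys/devices/system/node/node${numa_id}/hugepages/hugepages-1048576kB/`). In this mode the granularity at which OBMM allocates memory becomes 1GB, which affects both memory export and mempool allocation.
+- buddy_highmem: allocates memory from high addresses. The part covered by pmd_mapping is the allocation range, but the whole range is not guaranteed to be allocatable; while that memory is in local use, OBMM keeps retrying the allocation.
+
+| Type | Memory source | Allocation granularity | UMMU page-table granularity setting (supports 2M, 4M, ..., 256M; the maximum lendable memory is 128K * 2 * page-table granularity) | Other restrictions |
+| --- | --- | --- | --- | --- |
+| hugetlb_pmd | PMD-sized huge pages in hugetlbfs | PMD (2M when PAGE_SIZE is 4K) | Must be 2M | Usable only when pmd_mapping == 100%; PMD huge pages must be reserved by the user or on the kernel cmdline |
+| hugetlb_pud | PUD-sized huge pages in hugetlbfs | PUD (1G when PAGE_SIZE is 4K) | Any value (32M recommended) | Not restricted by pmd_mapping; PUD huge pages must be reserved by the user or on the kernel cmdline |
+| buddy_highmem | Allocated directly from buddy or by pfn | PMD (2M when PAGE_SIZE is 4K) | Must be 2M | pmd_mapping must be configured, with no restriction on its value; the pmd_mapping value is the theoretical upper limit of lendable memory |
+
+The memory source is set with the mempool_allocator module parameter when the obmm.ko kernel module is inserted. It cannot be changed at run time, and multiple memory sources cannot coexist.
+
+### Memory pool
+
+To speed up allocation, OBMM maintains an optional buffer memory pool whose size and adjustment interval are set by kernel module parameters. When exporting memory, OBMM allocates from the buffer pool first and falls back to requesting memory from the system through the sources above only if the pool runs short. Once per adjustment interval, OBMM tries to refill the pool to its target size. When local memory runs short and an OOM event is triggered, OBMM returns the pool's memory to the system and tries to refill the pool again after a while.
+
+## OBMM deployment
+
+### Kernel boot parameters
+
+The OBMM kernel module depends on the following kernel parameters:
+1. To use NUMA onlining, configure the numa_remote parameter. To use pre-onlining as well, configure numa_remote=preonline.
+   The parameter is documented as follows:
+```
+    numa_remote=    [ARM64,KNL]
+            Prepare unused NUMA Nodes as remote Nodes, allows to hotplug
+            remote memory on these remote NUMA Nodes when CONFIG_NUMA_REMOTE
+            is enabled. By default, all unused NUMA Nodes will be configured
+            as remote Nodes. cmdline numa_remote_max_nodes can be used to limit
+            the number of remote NUMA Nodes.
+            Format [arg0,][arg1]
+            preonline - allow to online unready memory and keep them isolated,
+                to improve the online performance.
+            nonfallback - the remote nodes don't appear in the zonelists of
+                other nodes, the remote memory can only be allocated by
+                specifying the remote node.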
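+            hugetlb_nowatermark - allocate hugetlb in remote node will ignore
+                watermark, and all memory can be allocated as
+                hugetlb.
+            - limit the number of remote NUMA Nodes.
+```
+2. To modify kernel page-table attributes correctly when allocating memory, configure the pmd_mapping parameter.
+   When mempool_allocator is buddy_highmem, pmd_mapping must be configured, and the configured percentage of system memory is the maximum lendable memory; when it is hugetlb_pmd, pmd_mapping=100% is required.
+   The parameter is documented as follows:
+```
+    pmd_mapping=    [ARM64,KNL]
+            Format: nn%
+            Allows to allocate contiguous memory from special pfn
+            range, the linear mapping granule of this range is never
+            larger than PMD. pmd_mapping specifies the percent of
+            memory of each node. pmd_mapping=100% is used for hugetlb
+            scenarios, the whole linear mapping isn't larger than PMD.
+```
+
+### Prerequisite kernel modules
+Besides the ubus, ubcore, hisi_ummu_core and hisi_soc_cache_framework drivers that the module itself depends on, the hisi_soc_hha module must be inserted for cache flushing to work correctly.
+Among these modules, the following parameters affect OBMM import and export:
+- The ubm_granule parameter of the UMMU module affects the minimum granularity of contiguous memory allocated by OBMM. When it is set to a value other than 0, OBMM's minimum allocation granularity must be raised accordingly. That is: *when this parameter is set to a value other than 0, OBMM currently must be configured with mempool_allocator=hugetlb_pud*.
+- The um_entry_size parameter of the UBUS module affects the minimum granularity of memory imported by OBMM. When it is set to a value other than 1, OBMM's minimum import granularity is raised accordingly.
+
+### OBMM module parameters
+
+The OBMM kernel module is `obmm.ko`. It supports the following 3 module parameters:
+
+1. **mempool_size=(\\d+)[KMG]**: default 1G. Sets the upper limit to which the memory pool maintained for each local NUMA node can grow. The pool is maintained dynamically; it only guarantees that memory already in the pool can be lent out quickly.
+2. **mempool_refill_timeout=(\\d+)** (in milliseconds): default 100. When local memory runs short, the pool shrinks to 0; this parameter sets how long OBMM waits after detecting the shortage before it tries to grow the pool again.
+3. **mempool_allocator**: string parameter that selects the memory source used when OBMM exports memory. hugetlb_pmd, hugetlb_pud and buddy_highmem are currently supported. If unspecified, the source is derived from the pmd_mapping parameter on the kernel command line: hugetlb_pmd when pmd_mapping=100%, buddy_highmem when pmd_mapping<100%.
+
+## OBMM device
+
+After the OBMM kernel module is inserted, it creates the `/dev/obmm` device. Users should invoke OBMM through the function interfaces described in libobmm(3); operating the device directly is not recommended.
diff --git a/doc/obmm_export.md b/doc/obmm_export.md
new file mode 100644
index 0000000..dbae399
--- /dev/null
+++ b/doc/obmm_export.md
@@ -0,0 +1,214 @@
+# obmm_export: export local memory
+
+## NAME
+
+`obmm_export`, `obmm_export_useraddr` - export local memory
+
+## LIBRARY
+
+OBMM user-space library (libobmm)
+
+## SYNOPSIS
+
+```c
+#include
+mem_id obmm_export(const size_t length[OBMM_MAX_LOCAL_NUMA_NODES], unsigned long flags, struct obmm_mem_desc *desc);
+mem_id obmm_export_useraddr(int pid, void* va, size_t length, unsigned long flags, struct obmm_mem_desc *desc);
+```
+
+## DESCRIPTION
+
+### obmm_export
+
+The memory provider allocates and exports a segment of memory for other machines to access. When the function returns, the exported memory has been zeroed, so no stale data leaks. See obmm(4) for how the memory is allocated.
+
+When it is no longer needed, memory created by obmm_export must be released with obmm_unexport(3).
+
+**Address and length alignment**
+
+* The requested memory length must be aligned to both the OBMM basic granularity and the memory allocator granularity.
+
+#### Input Parameters
+
+**length**: points to an array of `OBMM_MAX_LOCAL_NUMA_NODES` elements. The value of element *i* is the amount of memory this export requests from NUMA node *i*.
+
+length must satisfy the following requirements:
+
+1. length[i] must be aligned to the OBMM basic granularity and also satisfy the granularity constraints of the memory allocator and the UMMU.
+2. When length[i] is non-zero, NUMA node i must be a valid local NUMA node, and all such NUMA nodes must belong to the same CPU socket.
+3. The sum of all elements of length must be greater than zero.
+
+**flags**: attributes of the exported memory. The following flags are supported:
+*OBMM_EXPORT_FLAG_FAST*: allocate memory for the export only from the buffer memory pool. If the pool does not have enough memory, no memory is requested from the system and an error is returned immediately.
+*OBMM_EXPORT_FLAG_ALLOW_MMAP*: allow the memory to be used by mmap-ing the character device that corresponds to the memid.
+
+**desc**: points to an OBMM memory descriptor that passes in the attributes of the memory and receives the address information. The deid, priv_len and priv fields are inputs; the addr, length and tokenid fields are outputs; the other fields are ignored.
+
+```c
+struct obmm_mem_desc {
+    uint64_t addr;      // out: UBA generated by this export
+    uint64_t length;    // out: sum of all elements of length
+    /* 128bit eid, ordered by little-endian */
+    uint8_t seid[16];   // ignored by export
+    uint8_t deid[16];   // in: EID of the bus controller that hosts the lent memory
+    uint32_t tokenid;   // out: tokenid generated by this export
+    uint32_t scna;      // ignored by export
+    uint32_t dcna;      // ignored by export
+    uint16_t priv_len;  // in: length of priv[]
+    uint8_t priv[];     // in: optional user-private / vendor data
+};
+```
+
+### obmm_export_useraddr
+
+The memory provider calls this function on a range of a given process's address space to pin and export that memory for other machines to access.
+
+When it is no longer needed, memory created by obmm_export_useraddr must be released with obmm_unexport(3). The data in the memory is not cleared on release.
+
+**Restrictions**
+- The target address range must be aligned to PMD_SIZE, and the mappings inside it must have a granularity of at least PMD_SIZE and no smaller than the UMMU granularity constraint. hugetlb and THP mappings are currently supported.
+- After the call, the target memory is pinned.
+- After the call and until unexport runs, a kernel-mode access to the target memory causes a host panic.
+
+#### Input Parameters
+
+**pid**: pid of the target process.
+**va**: pointer to the start of the target virtual address range.
+**length**: length of the target memory.
+length must satisfy the following requirement:
+1. length must be aligned to the OBMM basic granularity and also satisfy the granularity constraints of the memory allocator and the UMMU.
+
+**flags**: attributes of the exported memory. Only 0 is currently supported.
+
+**desc**: points to an OBMM memory descriptor that passes in the attributes of the memory and receives the address information. The deid, priv_len and priv fields are inputs; the addr, length and tokenid fields are outputs; the other fields are ignored.
+
+```c
+struct obmm_mem_desc {
+    uint64_t addr;      // out: UBA generated by this export
+    uint64_t length;    // out: total length exported
+    /* 128bit eid, ordered by little-endian */
+    uint8_t seid[16];   // ignored by export
+    uint8_t deid[16];   // in: EID of the bus controller that hosts the lent memory
+    uint32_t tokenid;   // out: tokenid generated by this export
+    uint32_t scna;      // ignored by export
+    uint32_t dcna;      // ignored by export
+    uint16_t priv_len;  // in: length of priv[]
+    uint8_t priv[];     // in: optional user-private / vendor data
+};
+```
+
+## RETURN VALUE
+
+On success, the memory ID (memid) is returned and the detailed attributes of the exported memory are filled into desc. The corresponding character device /dev/obmm_shmdev\${memid} is created under /dev/. See obmm_shmdev(4).
+
+On failure, `OBMM_INVALID_MEMID` (0) is returned and the error type is stored in `errno`.
+
+## ERRORS
+
+Some of the situations behind each error code:
+* `EINVAL`:
+  * `length` or `desc` is NULL;
+  * the private data length exceeds `OBMM_MAX_PRIV_LEN` (`OBMM_MAX_PRIV_LEN` == 512);
+  * the memory size of a NUMA node is not an integer multiple of `OBMM_BASIC_GRANU` (2 MB);
+  * the NUMA nodes do not all belong to the same CPU socket;
+  * the memory size of a non-zero node is not aligned to `OBMM_MEMSEG_SIZE`;
+  * the total exported size is 0;
+  * flags contains bits other than `OBMM_EXPORT_FLAG_FAST` and `OBMM_EXPORT_FLAG_ALLOW_MMAP`.
+* `ENODEV`: memory may only be allocated from online local NUMA nodes.
+* `ENOMEM`: the system is out of memory.
+* `EEXIST`: the requested `region` already exists.
+* `E2BIG`: the number of NUMA nodes requested exceeds the maximum the system supports.
+* `ENOSPC`: no `memid` is available in the specified range.
+* `EOVERFLOW`: overflow; the total requested memory size exceeds the range of `unsigned long`.
+* `EPERM`: too many UMMU devices, exceeding `MAX_NUM_UMMU_DEVICES`.
+## CONSTRAINTS
+
+None.
+
+## NOTES
+
+None.
+
+## EXAMPLES
+
+The following program exports a 2MB segment of memory. On the local machine, the memory can be mapped and accessed through the obmm_shmdev(4) device. The program then reclaims the memory with obmm_unexport(3).
+
+```c
+#include
+#include
+#include
+
+#define SZ_2M (1UL << 21)
+
+int export_interface_demo(void)
+{
+    unsigned int device_deid = 0x101;
+    int ret;
+    mem_id id;
+    /* Export 2M memory from node 0 and no memory from other nodes. */
+    size_t length[OBMM_MAX_LOCAL_NUMA_NODES] = { SZ_2M };
+    /* Allocate memory only from OBMM buffer and create a mappable memory device. */
+    unsigned long flags = OBMM_EXPORT_FLAG_FAST | OBMM_EXPORT_FLAG_ALLOW_MMAP;
+    /* Specify that this memory device has no private data. */
+    struct obmm_mem_desc desc = {
+        .priv_len = 0
+    };
+    memcpy(desc.deid, &device_deid, 4);
+    /* Export memory from OBMM. */
+    id = obmm_export(length, flags, &desc);
+    if (id == OBMM_INVALID_MEMID) {
+        /* Export failed. */
+        perror("obmm_export() failed");
+        return -1;
+    }
+    /* Export succeeded. Key parameters written to @desc. */
+
+    /* Do your work here... */
+
+    /* Unexport memory. */
+    flags = 0;
+    ret = obmm_unexport(id, flags);
+    if (ret) {
+        perror("obmm_unexport() failed");
+        return -1;
+    }
+
+    return 0;
+}
+
+int export_useraddr_interface_demo(int pid, void *va)
+{
+    unsigned int device_deid = 0x101;
+    int ret;
+    mem_id id;
+    /* Export 2M of process @pid's memory at @va (PMD_SIZE-aligned, hugetlb or THP mapped). */
+    size_t length = SZ_2M;
+    /* obmm_export_useraddr() currently supports only flags == 0. */
+    unsigned long flags = 0;
+    /* Specify that this memory device has no private data. */
+    struct obmm_mem_desc desc = {
+        .priv_len = 0
+    };
+    memcpy(desc.deid, &device_deid, 4);
+    /* Pin and export the user memory. */
+    id = obmm_export_useraddr(pid, va, length, flags, &desc);
+    if (id == OBMM_INVALID_MEMID) {
+        /* Export failed. */
+        perror("obmm_export_useraddr() failed");
+        return -1;
+    }
+    /* Export succeeded. Key parameters written to @desc. */
+
+    /* Do your work here... */
+
+    /* Unexport memory. */
+    flags = 0;
+    ret = obmm_unexport(id, flags);
+    if (ret) {
+        perror("obmm_unexport() failed");
+        return -1;
+    }
+
+    return 0;
+}
+```
diff --git a/doc/obmm_import.md b/doc/obmm_import.md
new file mode 100644
index 0000000..c80ecf3
--- /dev/null
+++ b/doc/obmm_import.md
@@ -0,0 +1,175 @@
+# obmm_import: import remote memory
+
+## NAME
+
+`obmm_import` - import remote memory
+
+## LIBRARY
+
+OBMM user-space library (libobmm)
+
+## SYNOPSIS
+
+```c
+#include
+mem_id obmm_import(const struct obmm_mem_desc *desc, unsigned long flags, int base_dist, int *numa);
+```
+
+## DESCRIPTION
+
+The memory consumer imports a segment of remote memory, and a character device is created whose name ends with the memory ID (/dev/obmm_shmdev\${mem_id}). To use the memory through mmap, see obmm_shmdev(4).
+
+When it is no longer needed, memory created by obmm_import must be released with obmm_unimport(3).
+
+The memory must meet the following requirements to be imported successfully:
+
+**Address and length alignment**
+
+* The address and memory length passed in must be aligned to the OBMM basic granularity.
+* In NUMA_REMOTE mode, the memory length and the physical address (if applicable) must be aligned to 128MB (4K or 16K kernel pages) or 512MB (64K kernel pages).
+
+**No address conflicts**
+
+* For memory imported in preimport mode, the physical address must fall inside the memory reserved for the node, and that memory must not currently be online.
+* For memory imported in non-preimport mode, physical addresses must not overlap one another or any address range reserved by preimport.
+
+#### Input Parameters
+
+**desc**: points to an OBMM memory descriptor that contains the address, length, link and other information of the memory to import.
+
+| Field | Description |
+| --------- | --------------------------------------------- |
+| addr | Physical address corresponding to the remote memory (provided by whoever pre-programs the decoder) |
+| length | Size of the memory to import |
+| tokenid | Ignored |
+| deid | Ignored |
+| seid | EID of the consumer's UB controller used for the import |
+| scna | CNA address of the consumer's UB controller used for the import |
+| dcna | Ignored |
+| priv_len | Length of the private data |
+| priv | Private data; only shown in sysfs, does not affect the data path |
+
+**flags**: attributes of the imported memory.
+
+The software interface of the imported memory is determined by the following two flags:
+
+*OBMM_IMPORT_FLAG_NUMA_REMOTE*: the imported memory is onlined to NUMA.
+*OBMM_IMPORT_FLAG_ALLOW_MMAP*: the character device that corresponds to the imported memid supports mmap.
+
+Exactly one of these two flags must be specified; otherwise the import fails.
+
+Whether the import uses preimport acceleration is selected by the following flag:
+
+*OBMM_IMPORT_FLAG_PREIMPORT*: this import onlines the memory in pre-online mode, which gives a software speed-up when the memory is onlined to NUMA.
+
+The \ is used to match the pre-onlined node, and the physical address given by addr must fall inside the reserved physical address range.
+
+When *OBMM_IMPORT_FLAG_PREIMPORT* is specified, *OBMM_IMPORT_FLAG_NUMA_REMOTE* must be specified as well; otherwise the import fails.
+
+**base_dist**: the base distance used when the imported memory is onlined as a remote NUMA node, i.e. the distance from that remote NUMA node (n_r) to the local NUMA node hosting the importing physical chip (the base NUMA node, n_b).
+
+* When `base_dist` is 0, the distance from the new node to the other local nodes is defined as 100, and the distance from the new node to the other remote nodes is 254.
+* When `base_dist` is non-zero and less than or equal to 10, or greater than 254, the interface reports an invalid-argument error.
+* When `base_dist` is greater than 10 and less than or equal to 254, the distance from the new node to the other remote nodes is 254, and the distance from the new node to another local node n is dist(n, n_b) + dist(n_b, n_r) - dist(n_b, n_b). If this value exceeds 254, the distance is defined as 254.
+
+The base_dist parameter is ignored when *OBMM_IMPORT_FLAG_NUMA_REMOTE* is not specified or when *OBMM_IMPORT_FLAG_PREIMPORT* is specified.
+
+**numa**: points to an int that carries the desired node ID for the NUMA node to which the imported memory is onlined. The parameter is ignored in the following cases:
+
+1. The *OBMM_IMPORT_FLAG_NUMA_REMOTE* flag is not specified.
+2. Both the *OBMM_IMPORT_FLAG_NUMA_REMOTE* flag and the *OBMM_IMPORT_FLAG_PREIMPORT* flag are specified.
+
+When *OBMM_IMPORT_FLAG_NUMA_REMOTE* is specified, the NUMA node to online to is chosen by the following rules:
+
+* If *OBMM_IMPORT_FLAG_PREIMPORT* is also specified, the memory is onlined to the NUMA node that corresponds to the pre-onlined address range.
+* Without pre-onlining, if the pointer is NULL or the value it points to is NUMA_NO_NODE (-1), the import onlines to a new NUMA node; when the pointer is not NULL, the ID of the remote NUMA node is returned through it.
+* Otherwise, the value the pointer points to is used as the desired NUMA node ID. The node ID must not be the ID of a local NUMA node.
+
+If the user uses this parameter to name an existing remote NUMA node and onlines additional memory to it with obmm_import, the NUMA distance may be reconfigured:
+
+* For an import in pre-online mode, the previously set NUMA distance is not changed.
+* For an import in non-pre-online mode with base_dist equal to 0, the previously set NUMA distance is not changed.
+* For an import in non-pre-online mode with a non-zero base_dist, the NUMA distance is reconfigured from the new parameter.
+
+## RETURN VALUE
+
+On success, the memory ID (memid) is returned, and the corresponding character device /dev/obmm_shmdev\${memid} is created under /dev/. See obmm_shmdev(4).
+If the *numa* argument is not a null pointer, the NUMA ID the imported memory was actually onlined to is written to the memory *numa* points to. If no NUMA onlining is involved, -1 is written.
+
+On failure, `OBMM_INVALID_MEMID` (0) is returned and the error type is stored in `errno`.
+
+## ERRORS
+
+Some of the situations behind each error code:
+
+* `EPERM`: NUMA REMOTE related errors, most commonly bad parameters, resources in use, or insufficient local memory, for example:
+  * `*numa` specifies an invalid NUMA ID;
+  * the module was unloaded while loading;
+  * creating the remote NUMA node failed.
+* `ENODEV`: the scna passed in does not correspond to any UB controller.
+* `EINVAL`:
+  * desc or flags is `OBMM_INVALID_MEMID` (0);
+  * priv_len exceeds `OBMM_MAX_PRIV_LEN`;
+  * the requested memory size is zero;
+  * flags contains invalid bits; `ALLOW_MMAP` and `NUMA_REMOTE` are not specified exactly once between them; `OBMM_IMPORT_FLAG_PREIMPORT` is specified without `OBMM_IMPORT_FLAG_NUMA_REMOTE`;
+  * `base_dist` is non-zero and less than or equal to 10, or greater than 254;
+  * the address or memory length is not aligned to the OBMM basic granularity; pa is 0 or pa + size overflows;
+  * for memory imported in preimport mode, the physical address does not fall inside the memory reserved for the node;
+  * the scna, seid, dcna and deid used for the import do not match those used for the preimport.
+* `ENOMEM`: the system is out of memory.
+* `EBUSY`: the memory range at the physical address is already in use or conflicts.
+* `ENOSPC`: no `memid` is available in the specified range.
+* `EEXIST`: the region already exists.
+## CONSTRAINTS
+
+**Address validation constraint**
+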
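+The user must ensure that the physical address range passed in falls correctly inside the address range of UB remote memory.
+
+## NOTES
+
+None.
+
+## EXAMPLES
+
+The following function imports a segment of memory with physical base address `pa` and length `size`. On the local machine, the memory can be mapped and accessed through the obmm_shmdev(4) device. The function then stops importing the memory with obmm_unimport(3).
+
+```c
+#include
+#include
+#include
+
+int import_demo_decoder(unsigned long pa, size_t size, unsigned int scna, uint8_t *seid)
+{
+    int ret;
+    mem_id id;
+    /* Import the remote memory in such a way that we can open and mmap its char device. */
+    unsigned long flags = OBMM_IMPORT_FLAG_ALLOW_MMAP;
+    /* Specify that this memory device has no private data. */
+    struct obmm_mem_desc desc = {
+        .addr = pa,
+        .length = size,
+        .scna = scna,
+    };
+    memcpy(desc.seid, seid, 16);
+    id = obmm_import(&desc, flags, 0, NULL);
+    if (id == OBMM_INVALID_MEMID) {
+        perror("obmm_import() failed");
+        exit(EXIT_FAILURE);
+    }
+    /* Import succeeded. */
+
+    /* Do your work here... */
+
+    /* Unimport memory. */
+    ret = obmm_unimport(id, 0);
+    if (ret) {
+        perror("obmm_unimport() failed");
+        exit(EXIT_FAILURE);
+    }
+
+    return 0;
+}
+```
+
+For a preimport example, see obmm_preimport(3).
diff --git a/doc/obmm_preimport.md b/doc/obmm_preimport.md
new file mode 100644
index 0000000..c6a79dd
--- /dev/null
+++ b/doc/obmm_preimport.md
@@ -0,0 +1,166 @@
+# obmm_preimport: preimport remote memory
+
+## NAME
+
+`obmm_preimport` - preimport remote memory
+
+## LIBRARY
+
+OBMM user-space library (libobmm)
+
+## SYNOPSIS
+
+```c
+#include
+int obmm_preimport(struct obmm_preimport_info *preimport_info, unsigned long flags);
+```
+
+## DESCRIPTION
+
+When remote memory is imported in NUMA mode, creating the NUMA node takes a long time.
+
+Preimport completes the NUMA node creation in advance. It does not depend on the full information generated when the memory is exported, so it can be done before the export. When the memory is actually onlined, it only needs to be "injected" into the NUMA node, which speeds up the software flow on the critical path.
+
+The physical address range \ is the matching index.
+
+To get the speed-up when the memory is actually imported, obmm_import must set the *OBMM_IMPORT_FLAG_PREIMPORT* flag and its parameters must hit the index:
+
+The physical address range of obmm_import must be a subset of the preimported address range.
+
+Preimport parameters must satisfy the following requirements:
+
+**Unique index**
+
+The physical address ranges of different pre-online operations must not overlap, nor overlap the physical addresses of memory imported in non-pre-online mode.
+
+**Valid NUMA**
+
+When the NUMA ID passed in is not -1, it must be valid (non-negative) and must not refer to a local NUMA node.
+
+Preimported memory must be released with obmm_unpreimport(3).
+
+#### Input Parameters
+
+**preimport_info**:
+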
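+Information about the pre-onlined address range. It contains network and path information and is a partial description of memory, within the network scope, that is imported through a given path.
+
+| Field | Description |
+| --------- | ------------------------------------------------------------ |
+| pa | Physical base address of the pre-onlined memory |
+| length | Maximum amount of memory the pre-onlined NUMA node can hold |
+| scna | The IODie this node uses to access the target memory<br>Used for NUMA distance calculation and bookkeeping |
+| dcna | The provider-side IODie that accesses to the target memory pass through<br>Used for bookkeeping only; not involved in path configuration |
+| seid | The IODie this node uses to access the target memory<br>Recorded only; not involved in path configuration |
+| deid | The provider-side IODie that accesses to the target memory pass through<br>Recorded only; not involved in path configuration |
+| base_dist | Base distance from the newly onlined NUMA node to the consumer IODie |
+| numa_id | NUMA ID used by the pre-onlined NUMA node<br>When set to -1, receives the automatically assigned NUMA ID |
+| priv_len | Ignored |
+| priv | Ignored |
+
+**flags**: options (reserved, currently unused, must be 0).
+
+## RETURN VALUE
+
+On success, 0 is returned.
+
+On failure, -1 is returned and the error type is stored in `errno`.
+
+## ERRORS
+
+Some of the situations behind each error code:
+
+* `EPERM`: NUMA REMOTE related errors, most commonly bad parameters or resources in use, for example:
+  * the physical address or length does not meet the NUMA REMOTE alignment requirements;
+  * `*numa` specifies an invalid NUMA ID;
+  * the physical address resource is already in use.
+* `EINVAL`: a parameter does not meet the requirements, for example:
+  * bad preimport parameters, e.g. preimport_info is null or base_dist is outside the valid range;
+  * flags contains invalid bits; length is zero; invalid CNA pair.
+* `ENODEV`: scna is invalid, seid is invalid, or scna and seid do not match (they do not belong to the same UB entity).
+* `ENOMEM`: an error caused by the system running out of memory.
+* `EEXIST`: the memory range is already in use or conflicts.
+
+## CONSTRAINTS
+
+**Address validation constraint**
+
+The user must ensure that the physical address range passed in falls correctly inside the address range of UB remote memory.
+
+## NOTES
+
+None.
+
+## EXAMPLES
+
+The following function preimports a segment of memory with physical base address `pa` and length `size`. It creates a new remote NUMA node that initially has no usable memory.
+When obmm_import(3) is called, the memory is injected into the NUMA node. Finally, the program calls obmm_unimport(3) and obmm_unpreimport(3) to undo the onlining and pre-onlining of the memory.
+In practice, pre-onlining usually happens before the memory is exported; it is rare to call the pre-online and online interfaces back to back in one function as this example does.
+
+```c
+#include
+#include
+#include
+
+int preimport_demo_decoder(unsigned long pa, size_t size, unsigned int scna, uint8_t *seid)
+{
+    int ret;
+    mem_id id;
+    unsigned long import_flags;
+    /* Preimport [pa, pa+size) range and use a new NUMA node to hold the preimport info. */
+    struct obmm_preimport_info info = {
+        .pa = pa,
+        .length = size,
+        .scna = scna,
+        .base_dist = 0,
+        .numa_id = -1,
+        .priv_len = 0
+
+    };
+    memcpy(info.seid, seid, 16);
+    struct obmm_mem_desc desc = {};
+
+    ret = obmm_preimport(&info, 0);
+    if (ret) {
+        perror("obmm_preimport() failed");
+        exit(EXIT_FAILURE);
+    }
+    /* Preimport succeeded. The associated NUMA node id is stored in info.numa_id. */
+
+    /* Import all memory prepared in the preimport phase. OBMM would match the import request
+     * with preimport range using PA range. */
+    desc.addr = pa;
+    desc.length = size;
+    desc.scna = scna;
+    memcpy(desc.seid, seid, 16);
+    desc.priv_len = 0;
+    import_flags = OBMM_IMPORT_FLAG_NUMA_REMOTE | OBMM_IMPORT_FLAG_PREIMPORT;
+    id = obmm_import(&desc, import_flags, 0, NULL);
+    if (id == OBMM_INVALID_MEMID) {
+        perror("obmm_import() failed");
+        exit(EXIT_FAILURE);
+    }
+    /* Import succeeded. */
+
+    /* Do your work here... */
+
+    /* Unimport memory. */
+    ret = obmm_unimport(id, 0);
+    if (ret) {
+        perror("obmm_unimport() failed");
+        exit(EXIT_FAILURE);
+    }
+
+    /* To unpreimport a range, info.pa and info.length must exactly
+     * match the preimport parameters. All other fields have no effect. */
+    info.pa = pa;
+    info.length = size;
+    ret = obmm_unpreimport(&info, 0);
+    if (ret) {
+        perror("obmm_unpreimport() failed");
+        exit(EXIT_FAILURE);
+    }
+
+    return 0;
+}
+```
diff --git a/doc/obmm_preimport_sysfs.md b/doc/obmm_preimport_sysfs.md
new file mode 100644
index 0000000..4c091c4
--- /dev/null
+++ b/doc/obmm_preimport_sysfs.md
@@ -0,0 +1,26 @@
+# obmm_preimport_sysfs: OBMM preimport address range sysfs
+
+obmm_preimport sysfs is a sysfs text file that lists every preimported address range in the system.
+
+The file is located at `/proc/obmm/preimport_info`.
+
+Each preimported address range created by obmm_preimport(3) has one line of description in the file.
+
+From left to right, the columns are:
+
+* Start physical address: hexadecimal; the lowest valid address of the preimported range.
+* End physical address: hexadecimal; the highest valid address of the preimported range.
+* scna: hexadecimal; the clan network address of the memory consumer's bus controller; see the UB protocol for its meaning.
+* dcna: hexadecimal; the clan network address of the memory provider's bus controller; see the UB protocol for its meaning. The value in this column has no practical significance.
+* seid: hexadecimal; the entity id of the memory consumer's bus controller; see the UB protocol for its meaning. Recorded only; not involved in path configuration. Printed in u64 : u64 format.
+* deid: hexadecimal; the entity id of the memory provider's bus controller; see the UB protocol for its meaning. Recorded only; not involved in path configuration. Printed in u64 : u64 format.
+* numa_id: decimal; the NUMA node the preimported address range belongs to.
+
+Note: to stay consistent with standard diagnostic files such as /proc/iomem and /proc/\$pid/maps, the physical address range carries no 0x prefix but is still hexadecimal.
+
+Example:
+```
+cat /proc/obmm/preimport_info
+start - end : dcna scna deid seid nid
+50000000000 - 50007ffffff : 0x0 0x441 0x12:0x13 0x10:0x11 2
+```
\ No newline at end of file
diff --git a/doc/obmm_query.md b/doc/obmm_query.md
new file mode 100644
index 0000000..4a94f68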
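--- /dev/null +++ b/doc/obmm_query.md @@ -0,0 +1,59 @@ +# obmm_query: Address query and translation + +## NAME + +`obmm_query_memid_by_pa`, `obmm_query_pa_by_memid` - translate in both directions between two address descriptions (*physical address* and *memid, offset*). + +## LIBRARY + +OBMM user-space library (libobmm) + +## SYNOPSIS + +```c +#include +int obmm_query_memid_by_pa(unsigned long pa, mem_id *id, unsigned long *offset); +int obmm_query_pa_by_memid(mem_id id, unsigned long offset, unsigned long *pa); +``` + +## DESCRIPTION + +`obmm_query_memid_by_pa` takes a local physical address `pa` and looks up the OBMM memory ID `memid` that the address belongs to, together with the `offset` of the addressed byte within that memory. If the physical address is unrelated to OBMM memory, the function returns an error. + +`obmm_query_pa_by_memid` takes an OBMM memory ID `memid` and an `offset` within that memory and looks up the physical address `pa` of that byte. If `memid` does not exist or `offset` is out of range, the function returns an error. + +If the caller only cares whether the address is valid and not about the translated result, the corresponding output pointers can be set to `NULL`. + +**Note**: `offset` here means the *UBA offset*. The UBA offset relates to the virtual address offset and the physical address offset as follows: + +* PA offset and UBA offset + * On the export side, the PA is generally not contiguous, so there is no well-defined correspondence between the two. + * On the import side, on the current hardware generation, PA and UBA map linearly: PA offset = UBA offset. +* VA offset and UBA offset + * If the VA was obtained by mmap on the OBMM device (the shared mode of the legacy interface), VA and UBA map linearly: VA offset = UBA offset. + * If the VA is managed by NUMA, there is no well-defined correspondence between VA and UBA. + +## RETURN VALUE + +If the input describes a valid OBMM address, the function returns 0. For every output pointer that is not `NULL`, the translated address is written through it. + +If the input does not describe a valid OBMM address, the function returns -1 and stores the specific error in `errno`. + +## ERRORS + +Some of the situations behind each error code are listed below: + +* `ENOENT`: No OBMM memory with ID `memid` exists (in the original HiSilicon auto-align interface, this also covers the case where the consumer-side pa has no corresponding remote memory). +* `EINVAL`: The OBMM memory identified by `memid` exists, but `offset` is out of range. + +## CONSTRAINTS + +These functions are intended only for debugging, fault handling and similar scenarios. Frequent calls may degrade the performance of critical control-plane paths, so they should not be used on performance-sensitive service paths or called at high frequency. + +## NOTES + +These functions can be called on both the export side and the import side. + +On the export side, the PA corresponds to the provider's local physical memory (DIMM). + +On the import side, the PA lies in an address window that the chip reserves for UB memory and does not strictly correspond to physical memory. After the memory is unimported, the address may be reused by memory imported later. \ No newline at end of file diff --git a/doc/obmm_set_ownership.md b/doc/obmm_set_ownership.md new file mode 100644 index 0000000..92ec34d --- /dev/null +++ b/doc/obmm_set_ownership.md @@ -0,0 +1,75 @@ +# obmm_set_ownership: Change a process's access rights to OBMM memory + +## NAME + +`obmm_set_ownership` - change a process's access rights to OBMM memory + +## LIBRARY + +OBMM user-space library (libobmm) + +## SYNOPSIS + +```c +#include 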
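+int obmm_set_ownership(int fd, void *start, void *end, int prot); +``` + +## DESCRIPTION + +Changes a process's access rights to OBMM memory. It applies only to cacheable memory mapped through the character device, that is, memory whose fd was opened without O_SYNC. + +In the OBMM cacheable model, when at least one process on the local host holds a writable mapping, the local host has host-level write ownership, and no other host should hold any readable or writable cacheable mapping. + +When no process on the local host holds a writable mapping but at least one process holds a read-only mapping, the local host has host-level read ownership, and no other host should hold any writable cacheable mapping. + +Please note: + +* If the fd was opened with the O_SYNC flag, `obmm_set_ownership` cannot be used on the resulting NC mapping. +* If the same page is mapped writable by several processes, OBMM only guarantees that a hardware-level cache writeback is issued when the last process holding write permission releases it (by switching to read or no permission, or by unmapping). +* If the same page is mapped read-only or read-write by several processes, OBMM only guarantees that a hardware cache invalidation is issued when the last process holding any access permission releases it (by switching to no permission or by unmapping). +* obmm_set_ownership is not the only trigger for cache writeback. While the system runs, the cache is in constant use and the hardware evicts cache lines on its own, so the time at which dirty cache lines reach remote memory is not fixed. + +### Input Parameters + +**fd**: File descriptor of the obmm_shmdev(4) memory character device. + +**start**: Start virtual address of the permission change. It must be aligned to PAGE_SIZE. + +**end**: End virtual address of the permission change (the address itself is not part of the range). It must be aligned to PAGE_SIZE. +The range [start, end) must lie within the range mapped by mmap and must have a length greater than zero. + +**prot**: Target permission state. + +- PROT_NONE: no permission +- PROT_READ: read permission +- PROT_WRITE (or PROT_READ | PROT_WRITE): read-write permission + +## RETURN VALUE + +On success, returns 0. + +On failure, returns -1 and stores the specific error in `errno`. + +## ERRORS + +Some of the situations behind each error code are listed below: + +* `EINVAL`: + * the target permission is not one of PROT_NONE, PROT_READ and PROT_WRITE; + * the update is illegal: the mapping is non-cacheable; the start or end address is not aligned to PAGE_SIZE; start >= end. +* `EBUSY`: + * target is PROT_READ: the number of read mappings of some page in the range has reached its maximum; + * target is PROT_WRITE: the number of write mappings of some page in the range has reached its maximum. +* `EFAULT`: No VMA was found for the range to update; the file mapped by the range differs from the target device; the range extends beyond the VMA. +* `ENOTRECOVERABLE`: The cache flush failed. + + +## CONSTRAINTS + +See the coherence model described in libobmm(3). +Changing the coherence state of NC-mapped OBMM memory is currently not allowed. + +## NOTES + +None. diff --git a/doc/obmm_shmdev.md b/doc/obmm_shmdev.md new file mode 100644 index 0000000..d3babae --- /dev/null +++ b/doc/obmm_shmdev.md @@ -0,0 +1,130 @@ +# obmm_shmdev: OBMM memory device + +obmm_shmdev is the character device that OBMM creates to manage remote memory. If the memid of an OBMM memory region is `${id}`, its device is located at `/dev/obmm_shmdev${id}`. + +The obmm_shmdev device gives the user-space library (libobmm) an ioctl interface for configuring the memory device, and gives ordinary user-space applications an interface for mapping the OBMM memory. + +An obmm_shmdev can be an export device or an import device. Its detailed attributes can be inspected through obmm_shmdev_sysfs(5). + +## Creation and Destruction + +Applications should use obmm_export(3) or obmm_import(3) to create OBMM 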
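memory devices. Use obmm_unexport(3) or obmm_unimport(3) to destroy them. + +## Mapping and Access + +obmm_shmdev can only be mapped through POSIX interfaces. A standard mapping workflow consists of open(2) on the device, mmap(2), load/store accesses, munmap(2) and close(2). + +### open + +```C +int open(const char *pathname, int flags, ... + /* mode_t mode */ ); +``` + +When operating on obmm_shmdev, flags affects the following behavior of the shmdev: + +* Access mode: O_RDONLY requests read-only access, O_WRONLY requests write-only access (mapping is currently not possible with this flag), and O_RDWR requests read-write access. Exactly one of the three must be set. +* Synchronization mode: when O_SYNC is set in flags, semantics similar to synchronous I/O are used and the remote memory is mapped as non-cacheable; when O_SYNC is not set, the remote memory is mapped as cacheable. If the chip restricts the mapping mode of a physical address range, the O_SYNC setting must match that restriction. + +### mmap + +```C +void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset); +``` + +When operating on obmm_shmdev, the parameters additionally have the following meaning: + +* addr: the virtual address the application would like to use. By default it is only a hint (the kernel does not guarantee that this address is assigned). It can usually be set to NULL. +* length: length of the mapping; must be aligned to the page size. +* prot: access permission of the mapping. PROT_NONE means no permission (no mapping, no caching), PROT_READ means read-only, and PROT_WRITE or PROT_READ | PROT_WRITE means read-write. +* flags: by convention, the user must set MAP_SHARED and must not set MAP_ANONYMOUS. MAP_PRIVATE does not fit the semantics; if the user sets MAP_PRIVATE anyway, note that memory accesses still behave as with MAP_SHARED and the process's writes are visible to other processes. +* fd: must be a file descriptor obtained by open(2) on an obmm_shmdev device. +* offset: start offset of the mapped range; must be aligned to the page size. + +In addition, if the fd was opened with the O_SYNC flag, this mmap creates an NC mapping. In that case prot ends up as NC combined with PROT_NONE, PROT_READ or PROT_WRITE. In this mode the user is not allowed to change the coherence state, and NC mappings must not be mixed with CC mappings. + +On failure, mmap returns MAP_FAILED, that is `(void *)-1`; on success it returns a virtual address. The application can perform load and store accesses based on that address. Accesses must be consistent with the current permission of the memory range; otherwise a bus error may occur. + +OBMM supports partial mapping. Within one process, the ranges passed to mmap and munmap must match; different processes may map different page ranges. + +During access, the user can switch permissions with obmm_set_ownership(3) to share data across hosts at fine granularity. Note that NC mappings cannot switch permissions with obmm_set_ownership(3). + +When mapping, the application must also satisfy the following restrictions: + +* The memory behind the obmm_shmdev has the allow_mmap attribute. This attribute is configured through the flags used when the OBMM memory device is created (*OBMM_EXPORT_FLAG_ALLOW_MMAP* or *OBMM_IMPORT_FLAG_ALLOW_MMAP*). +* For each cacheable-mapped page, OBMM allows at most 2^16-1 mappers with write permission and 2^16-1 mappers with read permission at the same time, plus any number of no-permission mappings. Non-cacheable pages have no ownership concept and are not subject to this limit. +* The actual number of mappings a process can create is also bounded by kernel settings such as the file descriptor limit and the per-process mapping limit, not only by the OBMM state limits. + +While an obmm_shmdev device is open or mmapped, the memory device cannot be destroyed. Before destroying a memory device, the user must first remove the mappings and close the file descriptors. + +## Example + +The following function shows how an application accesses an OBMM memory device through standard POSIX interfaces. It uses open(2) and mmap(2) to map the memory device. When using cacheable memory, the application must use 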
obmm_set_ownership(3) to maintain the coherence model described in libobmm(3). When the access is finished, the function uses munmap(2) and close(2) to release its mapping and its hold on the device. + +```c +#include +#include +#include +#include +#include +#include + +#define SZ_2M (1UL << 21) + +#define MAX_OBMM_MEMDEV_PATH 128 +int memdev_demo(mem_id id, size_t size) +{ + int ret, fd, *value; + void *ptr; + char memdev_path[MAX_OBMM_MEMDEV_PATH]; + + ret = snprintf(memdev_path, sizeof(memdev_path), "/dev/obmm_shmdev%lu", id); + if (ret < 0 || ret >= (int)sizeof(memdev_path)) { + fprintf(stderr, "Failed to construct OBMM memdev path.\n"); + exit(EXIT_FAILURE); + } + + fd = open(memdev_path, O_RDWR); + if (fd == -1) { + perror("open() failed on OBMM memdev"); + exit(EXIT_FAILURE); + } + + /* Map the memory device with NONE access right. */ + ptr = mmap(NULL, size, PROT_NONE, MAP_SHARED, fd, 0); + if (ptr == MAP_FAILED) { + perror("mmap() failed on OBMM memdev"); + exit(EXIT_FAILURE); + } + /* Map char device succeeded. */ + + /* Do your work here. Here is a simple non-atomic increment example. */ + ret = obmm_set_ownership(fd, ptr, (void*)((uintptr_t)ptr + SZ_2M), PROT_WRITE); + if (ret == -1) { + perror("obmm_set_ownership() failed"); + exit(EXIT_FAILURE); + } + value = (int*)ptr; + *value = *value + 1; + ret = obmm_set_ownership(fd, ptr, (void*)((uintptr_t)ptr + SZ_2M), PROT_NONE); + if (ret == -1) { + perror("obmm_set_ownership() failed"); + exit(EXIT_FAILURE); + } + + /* Cleanup.*/ + ret = munmap(ptr, size); + if (ret == -1) { + perror("munmap() failed on OBMM memdev pointer"); + exit(EXIT_FAILURE); + } + + ret = close(fd); + if (ret == -1) { + perror("close() failed on OBMM memdev"); + exit(EXIT_FAILURE); + } + + return 0; +} +``` diff --git a/doc/obmm_shmdev_sysfs.md b/doc/obmm_shmdev_sysfs.md new file mode 100644 index 0000000..fb36a02 --- /dev/null +++ b/doc/obmm_shmdev_sysfs.md @@ -0,0 +1,131 @@ +# obmm_shmdev_sysfs: OBMM memory device sysfs + +obmm_shmdev sysfs is the sysfs directory of obmm_shmdev(4). If the memid of an OBMM memory region is `${id}`, its sysfs 
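directory is located at `/sys/devices/obmm/obmm_shmdev${id}/`. + +The sysfs directory contains several read-only text files and read-only binary files. Each file exposes one attribute of the memory device. They are intended for maintenance and debugging and are not recommended for control-plane functions or performance-critical paths. + +## Attribute overview + +Depending on where they apply, the attributes fall into three categories: + +* General information: attributes shared by provider and consumer. The files are located in the *root directory*, for example the memory length and type. +* Provider information: attributes specific to the provider. The files are located in the `export_info` *subdirectory*, for example how the memory is distributed across the local NUMA nodes. +* Consumer information: attributes specific to the consumer. The files are located in the `import_info` *subdirectory*, for example the local physical address that the remote memory is mapped to. + +Some attributes are available only when certain conditions hold. For example, the import_info/numa_id file exists only when the memory is imported in NUMA mode. + +``` +# Directory layout of an export memory device +obmm_shmdev${id}/ +├── type +├── ... +├── priv +└── export_info/ + ├── node_mem_size + ├── ... + └── uba +``` + +``` +# Directory layout of an import memory device +obmm_shmdev${id}/ +├── type +├── ... +├── priv +└── import_info/ + ├── pa + ├── ... + └── scna +``` + +Most of these attributes are text (ASCII) files that can be handled directly with the shell's `cat`, standard C's `fscanf` and similar tools. A few are binary files, such as the user-defined private data `priv`, which can be read with the shell's `xxd` or POSIX C's `read`. + +## General information + +General information is located at the root level of `/sys/devices/obmm/obmm_shmdev${mem_id}/` and contains the following: + +**type** +Type: text, string (either `export` or `import`) +Description: `export` means the memory is provider memory whose DRAM is local; `import` means the memory was imported from a remote host. + +**size** +Type: text, hexadecimal number +Description: total size of the memory in bytes. For example, `0x200000` means the memory is 2MB in total. + +**priv_len** +Type: text, decimal number, a non-negative value no greater than `OBMM_MAX_PRIV_LEN` +Description: size of the private metadata of the memory in bytes, specified by the user when the memory device is created. + +**priv** +Type: binary file +Description: private metadata of the memory, passed in by the user when the memory device is created. The file size is `OBMM_MAX_PRIV_LEN`; the part beyond offset `priv_len` can still be read, but its bytes are always 0. + +**allow_mmap** +Type: text, decimal number +Description: `0` means the obmm_shmdev(4) character device of the memory cannot be mapped with mmap(2); `1` means it can be mapped with mmap. + +## Provider information + +**export_info/memory_from_user** +Type: text, decimal number +Description: `0` means the exported memory was allocated by OBMM; `1` means the exported memory comes from the exporting process. + +**export_info/node_mem_size** +Type: text, hexadecimal CSV array (hexadecimal values separated by `,`) +Description: distribution of the memory across the local NUMA nodes. The values appear in NUMA node order; for example, `0x400000,0x0,0x200000` means 4MB was allocated from node 0 and 2MB from node 2. The array length is variable and is determined by the IDs of the nodes that actually provide memory; for example, on a 4-node machine where only node 1 provides memory, the array length is 2. + +**export_info/tokenid** +Type: text, hexadecimal number +Description: the tokenid that the memory belongs to. See the UB protocol for its meaning. + +**export_info/uba** +Type: text, hexadecimal number +Description: the UBA base address of the memory. See the UB protocol for its meaning. + +**export_info/deid** +Type: text, hexadecimal number, printed in u64 : u64 format +Description: the entity id of the memory provider's bus controller. See the UB protocol for its meaning. + +## Consumer information + +**import_info/pa** +Type: text, hexadecimal number 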
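+Description: the local physical base address of the remote memory after it has been imported. + +**import_info/scna** +Type: text, hexadecimal number +Description: the clan network address of the memory consumer's bus controller. See the UB protocol for its meaning. + +**import_info/dcna** +Type: text, hexadecimal number +Description: the clan network address of the memory provider's bus controller. See the UB protocol for its meaning. + +**import_info/seid** +Type: text, hexadecimal number, printed in u64 : u64 format +Description: the entity id of the memory consumer's bus controller. See the UB protocol for its meaning. + +**import_info/numa_id** +Type: text, decimal number +Description: the remote NUMA ID of the remote memory. The file does not exist if *OBMM_IMPORT_FLAG_NUMA_REMOTE* was not specified in obmm_import(3). + +**import_info/preimport** +Type: text, decimal number +Description: `0` means the memory was not brought online through preimport acceleration; `1` means it was brought online through preimport. The file does not exist if *OBMM_IMPORT_FLAG_NUMA_REMOTE* was not specified in obmm_import(3). + +Whether a consumer attribute applies depends on several factors, including the OBMM working mode and the import flags. The following table gives an overview: + +| Attribute | Applicable scenario | +| --------- | ------------------------- | +| pa | All scenarios | +| scna | All scenarios | +| deid | Recorded only | +| seid | Recorded only | +| numa_id | Memory imported in NUMA mode | +| preimport | Memory imported in NUMA mode | + +## FAQ + +When a C program accesses the sysfs interface and fails with `Too many open files`, there are usually two possible causes: + +1. The system's file descriptor limit is too low: check the current value with `ulimit -n` and raise it with `ulimit -n $new_limit`. +2. The program leaks file descriptors: verify whether the number of descriptors held by the process is reasonable, for example with `ls /proc/$pid/fd | wc -l`, and then check the code for missing `close(fd)` calls. diff --git a/doc/obmm_unexport.md b/doc/obmm_unexport.md new file mode 100644 index 0000000..15c5ea1 --- /dev/null +++ b/doc/obmm_unexport.md @@ -0,0 +1,52 @@ +# obmm_unexport: Cancel a local memory export + +## NAME + +`obmm_unexport` - cancel a local memory export + +## LIBRARY + +OBMM user-space library (libobmm) + +## SYNOPSIS + +```c +#include +int obmm_unexport(mem_id id, unsigned long flags); +``` + +## DESCRIPTION + +The memory provider reclaims exported memory by its memory ID. + +### Input Parameters + +**id**: ID of the memory to reclaim. + +**flags**: Options (reserved, currently unused; must be set to 0). + +## RETURN VALUE + +On success, returns 0. + +On failure, returns -1 and stores the specific error in `errno`. + +## ERRORS + +Some of the situations behind each error code are listed below: + +* `ENOENT`: No OBMM memory corresponds to the memid passed in. +* `EINVAL`: Undefined flags were passed, or the memid passed in is OBMM_INVALID_MEMID (0) or refers to imported memory. +* `EBUSY` : The region is in use. + +## CONSTRAINTS + +As a single-host component, OBMM cannot tell whether exported memory still has remote users. When the provider reclaims memory, **the user must ensure that the remote users have stopped using it**. Otherwise the remote processes using the memory end up in an unpredictable state and may experience process crashes, data inconsistency, hardware fault reports, kernel panics or other unexpected consequences. + +## NOTES + +None. + +##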
EXAMPLES + +See obmm_export(3). diff --git a/doc/obmm_unimport.md b/doc/obmm_unimport.md new file mode 100644 index 0000000..58ff779 --- /dev/null +++ b/doc/obmm_unimport.md @@ -0,0 +1,54 @@ +# obmm_unimport: Cancel a remote memory import + +## NAME + +`obmm_unimport` - cancel a remote memory import + +## LIBRARY + +OBMM user-space library (libobmm) + +## SYNOPSIS + +```c +#include +int obmm_unimport(mem_id id, unsigned long flags); +``` + +## DESCRIPTION + +The memory consumer releases imported memory by its memory ID. + +When memory imported as cacheable is unimported, all related cache lines are invalidated and are not written back. +To make sure that modifications are not lost, call obmm_set_ownership(3) manually before unimport to trigger the writeback. + +### Input Parameters + +**id**: ID of the memory to reclaim. + +**flags**: Options (reserved, currently unused; must be set to 0). + +## RETURN VALUE + +On success, returns 0. + +On failure, returns -1 and stores the specific error in `errno`. + +## ERRORS + +Some of the situations behind each error code are listed below: + +* `ENOENT`: No OBMM memory corresponds to the memid passed in. +* `EINVAL`: Undefined flags were passed, or the memid passed in is OBMM_INVALID_MEMID (0) or refers to exported memory. +* `EBUSY` : The region is in use. +## CONSTRAINTS + +None. + +## NOTES + +None. + +## EXAMPLES + +See obmm_import(3). diff --git a/doc/obmm_unpreimport.md b/doc/obmm_unpreimport.md new file mode 100644 index 0000000..ccd9a7d --- /dev/null +++ b/doc/obmm_unpreimport.md @@ -0,0 +1,73 @@ +# obmm_unpreimport: Undo a remote memory preimport + +## NAME + +`obmm_unpreimport` - undo a remote memory preimport + +## LIBRARY + +OBMM user-space library (libobmm) + +## SYNOPSIS + +```c +#include +int obmm_unpreimport(const struct obmm_preimport_info *preimport_info, unsigned long flags); +``` + +## DESCRIPTION + +Undoes the preimport of a memory range. + +A preimport can be undone only while the preimported address range has not actually been imported. + +### Input Parameters + +**preimport_info**: Information used to match the preimported address range. See obmm_preimport(3) for the data structure. + +The matching preimport record is located by pa and length and then released. + + +OBMM does not validate the fields that are unused in this scenario. + +| Field | Description | +| --------- | ------------------------------------------------------ | +| pa | Physical base address of the preimported memory; used to match the preimport record | +| length | Maximum amount of memory the preimport NUMA node can hold; used to match the preimport record | +| scna | Ignored | +| dcna | Ignored | +| seid | Ignored | +| deid | Ignored | +| base_dist | Ignored | +| numa_id | Ignored | +| priv_len | Ignored | +| priv | Ignored | + +**flags**: Options (reserved, currently unused; must be set to 0). + +## RETURN VALUE + +On success, returns 0. + +On failure, returns -1 and stores the specific error in `errno`. + 
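Since `EAGAIN` in the ERRORS list describes a transient condition, a caller may want to retry the call a bounded number of times. The block below is a minimal sketch of that pattern and is not part of libobmm: `retry_while_eagain` and `fake_unpreimport` are hypothetical names, and the stub stands in for a small wrapper around `obmm_unpreimport(&info, 0)` so that the snippet compiles without the library.

```c
#include <errno.h>

/* Hypothetical callback type: wraps one call such as obmm_unpreimport(&info, 0). */
typedef int (*obmm_op_fn)(void *ctx);

/*
 * Call `op` until it succeeds, fails with an errno other than EAGAIN,
 * or `max_tries` attempts have been made.
 * Returns 0 on success and -1 on failure, with errno left as set by `op`.
 */
static int retry_while_eagain(obmm_op_fn op, void *ctx, int max_tries)
{
    int ret = -1;

    if (max_tries <= 0) {
        errno = EINVAL;
        return -1;
    }

    for (int i = 0; i < max_tries; i++) {
        ret = op(ctx);
        if (ret == 0 || errno != EAGAIN)
            break;
        /* A real caller would probably sleep or back off here. */
    }
    return ret;
}

/* Stand-in for obmm_unpreimport(): fails with EAGAIN until *remaining drops to 0. */
static int fake_unpreimport(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        errno = EAGAIN;
        return -1;
    }
    return 0;
}
```

In real code the callback would call `obmm_unpreimport()` with the same `pa` and `length` that were used for the preimport; every other errno value is treated as final by this sketch.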
+## ERRORS + +Some of the situations behind each error code are listed below: + +* `ENOENT`: The remote memory block to remove does not exist or has already been released. +* `EINVAL`: preimport_info is NULL; invalid CNA pair; flags contains undefined bits; the physical address pa does not belong to a preimported range; the start address and length of the range to remove do not match exactly. +* `EAGAIN`: The preimport process has not finished; try again later. +* `EBUSY` : The range to remove is in use by another process. +* `EFAULT`: No information was found for the physical address. +## CONSTRAINTS + +None. + +## NOTES + +None. + +## EXAMPLES + +See obmm_preimport(3). -- Gitee From fffde524af726fa8dde8ee31544be4563a7f69d0 Mon Sep 17 00:00:00 2001 From: Wang Xin Date: Tue, 18 Nov 2025 19:35:18 +0800 Subject: [PATCH 2/3] update libobmm Signed-off-by: Wang Xin --- src/libobmm/CMakeLists.txt | 32 +++ src/libobmm/libobmm.c | 396 +++++++++++++++++++++++++++++++++++ src/libobmm/libobmm.h | 99 +++++++++ src/libobmm/vendor_adaptor.c | 292 ++++++++++++++++++++++++++ src/libobmm/vendor_adaptor.h | 33 +++ 5 files changed, 852 insertions(+) create mode 100644 src/libobmm/CMakeLists.txt create mode 100644 src/libobmm/libobmm.c create mode 100644 src/libobmm/libobmm.h create mode 100644 src/libobmm/vendor_adaptor.c create mode 100644 src/libobmm/vendor_adaptor.h diff --git a/src/libobmm/CMakeLists.txt b/src/libobmm/CMakeLists.txt new file mode 100644 index 0000000..b11d937 --- /dev/null +++ b/src/libobmm/CMakeLists.txt @@ -0,0 +1,32 @@ + +aux_source_directory(${CMAKE_CURRENT_LIST_DIR} LIBOBMM_DIR_SRCS) +add_library(OBMM_SO SHARED ${LIBOBMM_DIR_SRCS}) +target_link_libraries(OBMM_SO PRIVATE) + +target_include_directories(OBMM_SO PRIVATE ${CMAKE_CURRENT_LIST_DIR}) + +set(OBMM_NAME obmm) +set_target_properties(OBMM_SO PROPERTIES OUTPUT_NAME ${OBMM_NAME} + VERSION 1.0.1 + SOVERSION 1) + +target_compile_options(OBMM_SO PRIVATE -Wall -Wextra -Wfloat-equal -fno-common -std=gnu99 + -Wuninitialized -Wno-error -Wno-error=format -Wundef -Wunused -Wdate-time -Wshadow -Wvla + -Wdisabled-optimization -Wempty-body -Wignored-qualifiers -Wimplicit-fallthrough=3 + -Wtype-limits -Wshift-negative-value -Wswitch-default -Wframe-larger-than=8192 + -Wshift-overflow=2 -Wwrite-strings -Wmissing-format-attribute -Wformat-nonliteral + 
-Wduplicated-cond -Wtrampolines -Wlogical-op -Wsuggest-attribute=format + -Wduplicated-branches -Wmissing-include-dirs -Wformat-signedness -Wmissing-declarations + -Wreturn-local-addr -Wredundant-decls -Wfloat-conversion -Wmissing-prototypes + -Wstrict-prototypes) +if(GCOV) + # coverage test version + target_compile_options(OBMM_SO PRIVATE -O0 -fprofile-arcs -ftest-coverage + -fkeep-inline-functions -fkeep-static-functions) + target_link_libraries(OBMM_SO PRIVATE gcov) +else() + # release version + target_link_options(OBMM_SO PRIVATE -s) +endif() + +install(TARGETS OBMM_SO LIBRARY DESTINATION lib OPTIONAL) diff --git a/src/libobmm/libobmm.c b/src/libobmm/libobmm.c new file mode 100644 index 0000000..897e2b4 --- /dev/null +++ b/src/libobmm/libobmm.c @@ -0,0 +1,396 @@ +/* + * Copyright (c) 2025 Huawei Technologies Co., Ltd. All rights reserved. + * libobmm is licensed under Mulan PSL v2. + * You can use this software according to the terms and conditions of the Mulan PSL v2. + * You may obtain a copy of Mulan PSL v2 at: + * http://license.coscl.org.cn/MulanPSL2 + * THIS SOFTWARE IS PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OF ANY KIND, + * EITHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO NON-INFRINGEMENT, + * MERCHANTABILITY OR FIT FOR A PARTICULAR PURPOSE. + * + * See the Mulan PSL v2 for more details. 
+ * + * Description: libobmm main api + * Author: Gao Chao + * Create: 2025-10-28 + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include + +#include "vendor_adaptor.h" +#include "libobmm.h" + +#define NUMA_NO_NODE (-1) +#define OBMM_DEV_PATH "/dev/obmm" + +static int obmm_dev_get_fd(void) +{ + static int obmm_dev_fd = -1; + static pthread_mutex_t obmm_dev_fd_lock = PTHREAD_MUTEX_INITIALIZER; + int errsv = 0; + + pthread_mutex_lock(&obmm_dev_fd_lock); + if (obmm_dev_fd < 0) { + obmm_dev_fd = open(OBMM_DEV_PATH, O_RDWR); + if (obmm_dev_fd < 0) + errsv = errno; + } + pthread_mutex_unlock(&obmm_dev_fd_lock); + errno = errsv; + return obmm_dev_fd; +} + +int obmm_query_memid_by_pa(unsigned long pa, mem_id *id, unsigned long *offset) +{ + struct obmm_cmd_addr_query cmd_addr_query; + int fd, ret; + + fd = obmm_dev_get_fd(); + if (fd < 0) + return fd; + + memset(&cmd_addr_query, 0, sizeof(struct obmm_cmd_addr_query)); + cmd_addr_query.key_type = OBMM_QUERY_BY_PA; + cmd_addr_query.pa = pa; + ret = ioctl(fd, OBMM_CMD_ADDR_QUERY, &cmd_addr_query); + if (ret < 0) + return ret; + + if (id) + *id = cmd_addr_query.mem_id; + if (offset) + *offset = cmd_addr_query.offset; + return 0; +} + +int obmm_query_pa_by_memid(mem_id id, unsigned long offset, unsigned long *pa) +{ + struct obmm_cmd_addr_query cmd_addr_query; + int fd, ret; + + fd = obmm_dev_get_fd(); + if (fd < 0) + return fd; + memset(&cmd_addr_query, 0, sizeof(struct obmm_cmd_addr_query)); + cmd_addr_query.key_type = OBMM_QUERY_BY_ID_OFFSET; + cmd_addr_query.mem_id = id; + cmd_addr_query.offset = offset; + ret = ioctl(fd, OBMM_CMD_ADDR_QUERY, &cmd_addr_query); + if (ret < 0) + return ret; + + if (pa) + *pa = cmd_addr_query.pa; + return 0; +} + +mem_id obmm_export_useraddr(int pid, void* va, size_t length, + unsigned long flags, struct obmm_mem_desc *desc) +{ + struct obmm_cmd_export_pid cmd_export_pid = {0}; + int fd, ret; + + if (desc == NULL) { + 
errno = EINVAL; + return OBMM_INVALID_MEMID; + } + + fd = obmm_dev_get_fd(); + if (fd < 0) + return OBMM_INVALID_MEMID; + + cmd_export_pid.va = va; + cmd_export_pid.length = length; + cmd_export_pid.pid = pid; + cmd_export_pid.flags = flags; + cmd_export_pid.priv_len = desc->priv_len; + cmd_export_pid.priv = desc->priv; + memcpy(cmd_export_pid.deid, desc->deid, sizeof(cmd_export_pid.deid)); + + ret = vendor_adapt_export(desc, &cmd_export_pid.vendor_info, &cmd_export_pid.vendor_len, + &cmd_export_pid.pxm_numa); + if (ret) { + errno = ret; + return OBMM_INVALID_MEMID; + } + ret = ioctl(fd, OBMM_CMD_EXPORT_PID, &cmd_export_pid); + free_vendor_info((void *)cmd_export_pid.vendor_info); + if (ret < 0) + return OBMM_INVALID_MEMID; + + desc->addr = cmd_export_pid.uba; + desc->length = length; + desc->tokenid = cmd_export_pid.tokenid; + desc->scna = 0; + desc->dcna = 0; + + return cmd_export_pid.mem_id; +} + +mem_id obmm_export(const size_t length[OBMM_MAX_LOCAL_NUMA_NODES], + unsigned long flags, struct obmm_mem_desc *desc) +{ + struct obmm_cmd_export cmd_export; + int fd, i, ret, errsv; + mem_id memid; + + if (length == NULL || desc == NULL) { + errno = EINVAL; + return OBMM_INVALID_MEMID; + } + + fd = obmm_dev_get_fd(); + if (fd < 0) + return OBMM_INVALID_MEMID; + + memset(&cmd_export, 0, sizeof(struct obmm_cmd_export)); + memcpy(cmd_export.size, length, sizeof(size_t) * OBMM_MAX_LOCAL_NUMA_NODES); + cmd_export.length = OBMM_MAX_LOCAL_NUMA_NODES; + cmd_export.flags = flags; + cmd_export.priv_len = desc->priv_len; + cmd_export.priv = desc->priv; + memcpy(cmd_export.deid, desc->deid, sizeof(cmd_export.deid)); + + ret = vendor_adapt_export(desc, &cmd_export.vendor_info, &cmd_export.vendor_len, &cmd_export.pxm_numa); + if (ret) { + errno = ret; + return OBMM_INVALID_MEMID; + } + ret = ioctl(fd, OBMM_CMD_EXPORT, &cmd_export); + errsv = errno; + free_vendor_info((void *)cmd_export.vendor_info); + errno = errsv; + + if (ret < 0) + return OBMM_INVALID_MEMID; + + memid = 
cmd_export.mem_id; + + desc->addr = cmd_export.uba; + desc->tokenid = cmd_export.tokenid; + desc->scna = 0; + desc->dcna = 0; + desc->length = 0; + for (i = 0; i < OBMM_MAX_LOCAL_NUMA_NODES; i++) + desc->length += length[i]; + + return memid; +} + +static void fill_import_cmd_info(const struct obmm_mem_desc *desc, + struct obmm_cmd_import *cmd_import, + unsigned long flags, int base_dist) +{ + memset(cmd_import, 0, sizeof(struct obmm_cmd_import)); + cmd_import->addr = desc->addr; + cmd_import->length = desc->length; + cmd_import->tokenid = desc->tokenid; + cmd_import->scna = desc->scna; + cmd_import->dcna = desc->dcna; + cmd_import->priv_len = desc->priv_len; + cmd_import->priv = desc->priv; + cmd_import->flags = flags; + cmd_import->base_dist = base_dist; + memcpy(cmd_import->deid, desc->deid, sizeof(cmd_import->deid)); + memcpy(cmd_import->seid, desc->seid, sizeof(cmd_import->seid)); +} + +mem_id obmm_import(const struct obmm_mem_desc *desc, unsigned long flags, + int base_dist, int *numa) +{ + struct obmm_cmd_import cmd_import; + int fd, ret, errsv; + mem_id memid; + + if (desc == NULL) { + errno = EINVAL; + return OBMM_INVALID_MEMID; + } + + if (((flags & OBMM_IMPORT_FLAG_NUMA_REMOTE) && !(flags & OBMM_IMPORT_FLAG_PREIMPORT)) && + (base_dist < 0 || base_dist > UINT8_MAX)) { + errno = EINVAL; + return OBMM_INVALID_MEMID; + } + + fill_import_cmd_info(desc, &cmd_import, flags, base_dist); + + cmd_import.mem_id = 0; + if (numa != NULL) + cmd_import.numa_id = *numa; + else + cmd_import.numa_id = NUMA_NO_NODE; + + fd = obmm_dev_get_fd(); + if (fd < 0) + return OBMM_INVALID_MEMID; + + ret = vendor_fixup_import_cmd(&cmd_import); + if (ret) + return OBMM_INVALID_MEMID; + + ret = ioctl(fd, OBMM_CMD_IMPORT, &cmd_import); + errsv = errno; + vendor_cleanup_import_cmd(&cmd_import); + errno = errsv; + + if (ret < 0) + return OBMM_INVALID_MEMID; + + if (numa != NULL) + *numa = cmd_import.numa_id; + memid = cmd_import.mem_id; + + return memid; +} + +int obmm_unexport(mem_id id, 
unsigned long flags) +{ + struct obmm_cmd_unexport cmd_unexport; + int fd; + + if (id == OBMM_INVALID_MEMID) { + errno = EINVAL; + return -1; + } + + fd = obmm_dev_get_fd(); + if (fd < 0) + return fd; + + cmd_unexport.mem_id = id; + cmd_unexport.flags = flags; + + return ioctl(fd, OBMM_CMD_UNEXPORT, &cmd_unexport); +} + +int obmm_unimport(mem_id id, unsigned long flags) +{ + struct obmm_cmd_unimport cmd_unimport; + int fd; + + if (id == OBMM_INVALID_MEMID) { + errno = EINVAL; + return -1; + } + + fd = obmm_dev_get_fd(); + if (fd < 0) + return fd; + + cmd_unimport.mem_id = id; + cmd_unimport.flags = flags; + + return ioctl(fd, OBMM_CMD_UNIMPORT, &cmd_unimport); +} + +int obmm_set_ownership(int fd, void *start, void *end, int prot) +{ + uint64_t mem_attr; + struct obmm_cmd_update_range update_info; + + if (prot == PROT_NONE) { + mem_attr = OBMM_SHM_MEM_NORMAL_NC | OBMM_SHM_MEM_NO_ACCESS; + } else if (prot == PROT_READ) { + mem_attr = OBMM_SHM_MEM_NORMAL | OBMM_SHM_MEM_READONLY; + } else if (prot == PROT_WRITE || prot == (PROT_READ | PROT_WRITE)) { + mem_attr = OBMM_SHM_MEM_NORMAL | OBMM_SHM_MEM_READWRITE; + } else { + errno = EINVAL; + return -1; + } + + update_info.start = (uintptr_t)start; + update_info.end = (uintptr_t)end; + update_info.mem_state = mem_attr; + update_info.cache_ops = OBMM_SHM_CACHE_INFER; + + return ioctl(fd, OBMM_SHMDEV_UPDATE_RANGE, &update_info); +} + +int obmm_preimport(struct obmm_preimport_info *preimport_info, unsigned long flags) +{ + struct obmm_cmd_preimport cmd; + int ret, fd, errsv; + + if (preimport_info == NULL) { + errno = EINVAL; + return -1; + } + + if (preimport_info->base_dist < 0 || preimport_info->base_dist > UINT8_MAX) { + errno = EINVAL; + return -1; + } + + fd = obmm_dev_get_fd(); + if (fd < 0) + return fd; + + cmd.pa = preimport_info->pa; + cmd.length = preimport_info->length; + cmd.base_dist = preimport_info->base_dist; + cmd.numa_id = preimport_info->numa_id; + cmd.scna = preimport_info->scna; + cmd.dcna = 
preimport_info->dcna; + cmd.priv_len = preimport_info->priv_len; + cmd.priv = &preimport_info->priv; + cmd.flags = flags; + memcpy(cmd.deid, preimport_info->deid, sizeof(cmd.deid)); + memcpy(cmd.seid, preimport_info->seid, sizeof(cmd.seid)); + + ret = vendor_fixup_preimport_cmd(&cmd); + if (ret) + return ret; + + ret = ioctl(fd, OBMM_CMD_DECLARE_PREIMPORT, &cmd); + errsv = errno; + vendor_cleanup_preimport_cmd(&cmd); + errno = errsv; + + if (ret < 0) + return ret; + preimport_info->numa_id = cmd.numa_id; + return 0; +} + +int obmm_unpreimport(const struct obmm_preimport_info *preimport_info, unsigned long flags) +{ + struct obmm_cmd_preimport cmd; + int fd; + + if (preimport_info == NULL) { + errno = EINVAL; + return -1; + } + + fd = obmm_dev_get_fd(); + if (fd < 0) + return fd; + + cmd.pa = preimport_info->pa; + cmd.length = preimport_info->length; + cmd.base_dist = preimport_info->base_dist; + cmd.numa_id = preimport_info->numa_id; + cmd.scna = preimport_info->scna; + cmd.dcna = preimport_info->dcna; + cmd.priv_len = preimport_info->priv_len; + cmd.priv = &preimport_info->priv; + cmd.flags = flags; + memcpy(cmd.deid, preimport_info->deid, sizeof(cmd.deid)); + memcpy(cmd.seid, preimport_info->seid, sizeof(cmd.seid)); + + return ioctl(fd, OBMM_CMD_UNDECLARE_PREIMPORT, &cmd); +} diff --git a/src/libobmm/libobmm.h b/src/libobmm/libobmm.h new file mode 100644 index 0000000..b7a78e0 --- /dev/null +++ b/src/libobmm/libobmm.h @@ -0,0 +1,99 @@ +/* + * Copyright (c) 2025 Huawei Technologies Co., Ltd. All rights reserved. + * libobmm is licensed under Mulan PSL v2. + * You can use this software according to the terms and conditions of the Mulan PSL v2. + * You may obtain a copy of Mulan PSL v2 at: + * http://license.coscl.org.cn/MulanPSL2 + * THIS SOFTWARE IS PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OF ANY KIND, + * EITHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO NON-INFRINGEMENT, + * MERCHANTABILITY OR FIT FOR A PARTICULAR PURPOSE. 
+ * + * See the Mulan PSL v2 for more details. + * + * Description: libobmm main api + * Author: Gao Chao + * Create: 2025-10-28 + */ + +#ifndef _OBMM_API_H +#define _OBMM_API_H + +#include +#include +#include + +#if defined(__cplusplus) +extern "C" { +#endif + +#define MAX_NUMA_NODES 16 +#define OBMM_INVALID_MEMID 0 + +typedef uint64_t mem_id; + +struct obmm_mem_desc { + uint64_t addr; + uint64_t length; + /* 128bit eid, ordered by little-endian */ + uint8_t seid[16]; + uint8_t deid[16]; + uint32_t tokenid; + uint32_t scna; + uint32_t dcna; + uint16_t priv_len; + uint8_t priv[]; +}; + +struct obmm_preimport_info { + uint64_t pa; + uint64_t length; + int base_dist; + int numa_id; + uint8_t seid[16]; + uint8_t deid[16]; + uint32_t scna; + uint32_t dcna; + /* mar_id, etc */ + uint16_t priv_len; + uint8_t priv[]; +}; + +mem_id obmm_export(const size_t length[OBMM_MAX_LOCAL_NUMA_NODES], unsigned long flags, struct obmm_mem_desc *desc); +int obmm_unexport(mem_id id, unsigned long flags); + + +int obmm_preimport(struct obmm_preimport_info *preimport_info, unsigned long flags); +int obmm_unpreimport(const struct obmm_preimport_info *preimport_info, unsigned long flags); + +/* Export the specified va range of the process pid out of localhost. + * Due to hardware limitations, during the export process, the corresponding + * physical memory for the VA (virtual address) range will be allocated and + * pinned, and the related pages will be checked to see if they are 2M pages. + * + * pid: the ID of the process in which va range are to exported. If pid is 0, + * export va range of the calling process. + **/ +mem_id obmm_export_useraddr(int pid, void* va, size_t length, unsigned long flags, struct obmm_mem_desc *desc); + +mem_id obmm_import(const struct obmm_mem_desc *desc, unsigned long flags, int base_dist, int *numa); +int obmm_unimport(mem_id id, unsigned long flags); + +/* + * Set the ownership (reader, writer, none) of a range of OBMM virtual address space. 
+ * @fd: The file descriptor of an OBMM memory device. + * @start: The start virtual address. + * @end: The end virtual address (exclusive). + * @prot: The ownership expressed as memory protection bits (PROT_NONE, PROT_READ, PROT_WRITE). + * NOTE: PROT_WRITE implies PROT_READ. + */ +int obmm_set_ownership(int fd, void *start, void *end, int prot); + +/* debug interface */ +int obmm_query_memid_by_pa(unsigned long pa, mem_id *id, unsigned long *offset); +int obmm_query_pa_by_memid(mem_id id, unsigned long offset, unsigned long *pa); + +#if defined(__cplusplus) +} +#endif + +#endif diff --git a/src/libobmm/vendor_adaptor.c b/src/libobmm/vendor_adaptor.c new file mode 100644 index 0000000..c7d86ff --- /dev/null +++ b/src/libobmm/vendor_adaptor.c @@ -0,0 +1,292 @@ +/* + * Copyright (c) 2025 Huawei Technologies Co., Ltd. All rights reserved. + * libobmm is licensed under Mulan PSL v2. + * You can use this software according to the terms and conditions of the Mulan PSL v2. + * You may obtain a copy of Mulan PSL v2 at: + * http://license.coscl.org.cn/MulanPSL2 + * THIS SOFTWARE IS PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OF ANY KIND, + * EITHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO NON-INFRINGEMENT, + * MERCHANTABILITY OR FIT FOR A PARTICULAR PURPOSE. + * + * See the Mulan PSL v2 for more details. + * + * Description: libobmm main api + * Author: Gao Chao + * Create: 2025-10-28 + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include "vendor_adaptor.h" + +#define pr_err(fmt, ...) 
fprintf(stderr, "libobmm: [vendor-adaptor][ERROR]" fmt, ##__VA_ARGS__) + +#define EID_FMT64 "%#lx:%#lx" +#define EID_ARGS64(eid) (*(uint64_t *)&(eid)[8]), (*(uint64_t *)&(eid)[0]) + +#define EID_SIZE 16 +#define MAX_CONTROLLERS 8 +#define MAX_PATH 256 +#define MAX_CHAR 64 +#define INVAL_UMMU_MAPPING (-1) + +enum hisi_ummu_tdev_version { + HISI_TDEV_INFO_V1 = 0, +}; + +struct hisi_ummu_tdev_info { + enum hisi_ummu_tdev_version ver; + union { + struct { + unsigned long ummu_idx_mask; // ummu_mapping mask + bool on_chip; // sram / dram + } v1; + }; +}; + +struct ub_bus_ctl_node { + int ummu_mapping; + int numa_id; + bool valid; +}; + +static uint8_t g_invalid_eid[16]; + +static int read_int_from_file(const char *path) +{ + FILE *fp = fopen(path, "r"); + char str[MAX_CHAR], *end; + size_t nread; + long ret; + + if (!fp) { + pr_err("failed to open file %s.\n", path); + return -1; + } + + nread = fread(str, 1, sizeof(str) - 1, fp); + if (nread == 0) { + pr_err("failed to read data from %s.\n", path); + (void)fclose(fp); + return -1; + } + (void)fclose(fp); + str[nread] = '\0'; + /* hex and decimal are possible */ + ret = strtol(str, &end, 0); + if (end == str) { + pr_err("failed to parse int value from '%s' in %s.\n", str, path); + return -1; + } + if (ret > INT_MAX || ret < INT_MIN) { + pr_err("value read from %s overflows int.\n", path); + return -1; + } + return (int)ret; +} + +static int get_ubc_attr(const char *ubc_path, const char *attr) +{ + char attr_path[MAX_PATH]; + int ret; + + ret = snprintf(attr_path, sizeof(attr_path), "%s/%s", ubc_path, attr); + /* reject encoding errors and truncated paths */ + if (ret <= 0 || ret >= (int)sizeof(attr_path)) + return -1; + return read_int_from_file(attr_path); +} + +static int get_ubc_path(int ubc_index, char *ubc_path, size_t path_len) +{ + char pattern[MAX_PATH], *glob_path; + glob_t g; + int ret; + + (void)snprintf(pattern, sizeof(pattern), "/sys/devices/ub_bus_controller%d/*/ubc", ubc_index); + + ret = glob(pattern, 0, NULL, &g); + if (ret != 0) { + globfree(&g); + return ENODEV; + } + if (g.gl_pathc == 0) 
{ + globfree(&g); + return ENODEV; + } + glob_path = dirname(g.gl_pathv[0]); + if (strlen(glob_path) >= path_len) { + globfree(&g); + return EINVAL; + } + (void)snprintf(ubc_path, path_len, "%s", glob_path); + globfree(&g); + return 0; +} + +static int get_ubc_by_eid(unsigned int *ubc_index, char *ubc_path, size_t path_len, const uint8_t *eid) +{ + for (unsigned int i = 0; i < MAX_CONTROLLERS; i++) { + int ret = get_ubc_path(i, ubc_path, path_len); + if (ret) + continue; + + ret = get_ubc_attr(ubc_path, "eid"); /* host endian */ + if (ret < 0) { + pr_err("failed to read ctl eid, path %s.\n", ubc_path); + errno = ENODEV; + return -1; + } + + uint8_t sysfs_eid[EID_SIZE] = {0}; + memcpy(sysfs_eid, &ret, sizeof(ret)); /* avoids a misaligned, aliasing store */ + + if (memcmp(sysfs_eid, eid, EID_SIZE) != 0) + continue; + + *ubc_index = i; + return 0; + } + pr_err("failed to find ctl, eid:" EID_FMT64 ".\n", EID_ARGS64(eid)); + errno = ENODEV; + return -1; +} + +static struct ub_bus_ctl_node get_ctl_by_eid(uint8_t *eid) +{ + struct ub_bus_ctl_node node = {0}; + char ubc_path[MAX_PATH]; + unsigned int ubc_index; + + int ret = get_ubc_by_eid(&ubc_index, ubc_path, sizeof(ubc_path), eid); + if (ret) + return node; + + node.ummu_mapping = get_ubc_attr(ubc_path, "ummu_map"); + if (node.ummu_mapping < 0) { + pr_err("failed to read ctl ummu_map, path %s.\n", ubc_path); + return node; + } + + node.numa_id = get_ubc_attr(ubc_path, "numa"); + if (node.numa_id < 0) { + pr_err("failed to read ctl numa, path %s.\n", ubc_path); + return node; + } + node.valid = true; + return node; +} + +static int get_primary_cna_by_eid(unsigned int *cna, const uint8_t *eid) +{ + char ubc_path[MAX_PATH]; + unsigned int ubc_index; + + int ret = get_ubc_by_eid(&ubc_index, ubc_path, sizeof(ubc_path), eid); + if (ret) + return ret; + + ret = get_ubc_attr(ubc_path, "primary_cna"); + if (ret < 0) { + pr_err("failed to read ctl primary_cna, path %s.\n", ubc_path); + errno = ENODEV; + return -1; + } + *cna = (unsigned int)ret; + + return 0; +} 
+ +static int init_vendor_info(int ummu_mapping, const void **vendor_info, uint16_t *vendor_len) +{ + struct hisi_ummu_tdev_info *info = (struct hisi_ummu_tdev_info *)calloc(1, sizeof(*info)); + + if (!info) + return ENOMEM; + + if (sizeof(struct hisi_ummu_tdev_info) > OBMM_MAX_VENDOR_LEN) { + free(info); + return EINVAL; + } + + info->ver = HISI_TDEV_INFO_V1; + info->v1.on_chip = true; + info->v1.ummu_idx_mask = 1UL << ummu_mapping; /* shift in unsigned long to match the mask type */ + *vendor_info = info; + *vendor_len = sizeof(struct hisi_ummu_tdev_info); + return 0; +} + +int vendor_adapt_export(struct obmm_mem_desc *desc, const void **vendor_info, + uint16_t *vendor_len, int *numa) +{ + struct ub_bus_ctl_node node; + int ret; + + if (memcmp(desc->deid, g_invalid_eid, sizeof(desc->deid)) == 0) { + pr_err("all-zero eid is not allowed.\n"); + return EINVAL; + } + node = get_ctl_by_eid(desc->deid); + if (!node.valid) + return ENODEV; + + ret = init_vendor_info(node.ummu_mapping, vendor_info, vendor_len); + if (ret) { + pr_err("init_vendor_info failed, ret %d.\n", ret); + return ret; + } + *numa = node.numa_id; + return 0; +} + +void free_vendor_info(void *vendor_info) +{ + free(vendor_info); +} + +int vendor_fixup_import_cmd(struct obmm_cmd_import *cmd) +{ + unsigned int cna; + int ret = get_primary_cna_by_eid(&cna, cmd->seid); + if (ret) + return ret; + if (cna != cmd->scna) { + pr_err("ctl with eid " EID_FMT64 " has primary_cna=%#x, which differs from requested scna=%#x.\n", + EID_ARGS64(cmd->seid), cna, cmd->scna); + errno = ENODEV; + return -1; + } + return 0; +} + +void vendor_cleanup_import_cmd(struct obmm_cmd_import *cmd) +{ +} + +int vendor_fixup_preimport_cmd(struct obmm_cmd_preimport *cmd) +{ + unsigned int cna; + int ret = get_primary_cna_by_eid(&cna, cmd->seid); + if (ret) + return ret; + if (cna != cmd->scna) { + pr_err("ctl with eid " EID_FMT64 " has primary_cna=%#x, which differs from requested scna=%#x.\n", + EID_ARGS64(cmd->seid), cna, cmd->scna); + errno = ENODEV; + return -1; + } + return 0; +} + +void 
vendor_cleanup_preimport_cmd(struct obmm_cmd_preimport *cmd) +{ +} diff --git a/src/libobmm/vendor_adaptor.h b/src/libobmm/vendor_adaptor.h new file mode 100644 index 0000000..834f303 --- /dev/null +++ b/src/libobmm/vendor_adaptor.h @@ -0,0 +1,33 @@ +/* + * Copyright (c) 2025 Huawei Technologies Co., Ltd. All rights reserved. + * libobmm is licensed under Mulan PSL v2. + * You can use this software according to the terms and conditions of the Mulan PSL v2. + * You may obtain a copy of Mulan PSL v2 at: + * http://license.coscl.org.cn/MulanPSL2 + * THIS SOFTWARE IS PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OF ANY KIND, + * EITHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO NON-INFRINGEMENT, + * MERCHANTABILITY OR FIT FOR A PARTICULAR PURPOSE. + * + * See the Mulan PSL v2 for more details. + * + * Description: libobmm main api + * Author: Gao Chao + * Create: 2025-10-28 + */ + +#ifndef _VENDOR_ADAPTOR_H +#define _VENDOR_ADAPTOR_H + +#include + +int vendor_adapt_export(struct obmm_mem_desc *desc, const void **vendor_info, + uint16_t *vendor_len, int *numa); +void free_vendor_info(void *vendor_info); + +int vendor_fixup_import_cmd(struct obmm_cmd_import *cmd); +void vendor_cleanup_import_cmd(struct obmm_cmd_import *cmd); + +int vendor_fixup_preimport_cmd(struct obmm_cmd_preimport *cmd); +void vendor_cleanup_preimport_cmd(struct obmm_cmd_preimport *cmd); + +#endif \ No newline at end of file -- Gitee From d3bc0351e0f19141832e893d328d3747b464c75f Mon Sep 17 00:00:00 2001 From: Wang Xin Date: Tue, 18 Nov 2025 19:52:00 +0800 Subject: [PATCH 3/3] update LICENSE Signed-off-by: Wang Xin --- License/LICENSE | 127 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 127 insertions(+) create mode 100644 License/LICENSE diff --git a/License/LICENSE b/License/LICENSE new file mode 100644 index 0000000..e5a286c --- /dev/null +++ b/License/LICENSE @@ -0,0 +1,127 @@ + 木兰宽松许可证, 第2版 + + 木兰宽松许可证, 第2版 + 2020年1月 http://license.coscl.org.cn/MulanPSL2 + + + 
您对“软件”的复制、使用、修改及分发受木兰宽松许可证,第2版(“本许可证”)的如下条款的约束: + + 0. 定义 + + “软件”是指由“贡献”构成的许可在“本许可证”下的程序和相关文档的集合。 + + “贡献”是指由任一“贡献者”许可在“本许可证”下的受版权法保护的作品。 + + “贡献者”是指将受版权法保护的作品许可在“本许可证”下的自然人或“法人实体”。 + + “法人实体”是指提交贡献的机构及其“关联实体”。 + + “关联实体”是指,对“本许可证”下的行为方而言,控制、受控制或与其共同受控制的机构,此处的控制是指有受控方或共同受控方至少50%直接或间接的投票权、资金或其他有价证券。 + + 1. 授予版权许可 + + 每个“贡献者”根据“本许可证”授予您永久性的、全球性的、免费的、非独占的、不可撤销的版权许可,您可以复制、使用、修改、分发其“贡献”,不论修改与否。 + + 2. 授予专利许可 + + 每个“贡献者”根据“本许可证”授予您永久性的、全球性的、免费的、非独占的、不可撤销的(根据本条规定撤销除外)专利许可,供您制造、委托制造、使用、许诺销售、销售、进口其“贡献”或以其他方式转移其“贡献”。前述专利许可仅限于“贡献者”现在或将来拥有或控制的其“贡献”本身或其“贡献”与许可“贡献”时的“软件”结合而将必然会侵犯的专利权利要求,不包括对“贡献”的修改或包含“贡献”的其他结合。如果您或您的“关联实体”直接或间接地,就“软件”或其中的“贡献”对任何人发起专利侵权诉讼(包括反诉或交叉诉讼)或其他专利维权行动,指控其侵犯专利权,则“本许可证”授予您对“软件”的专利许可自您提起诉讼或发起维权行动之日终止。 + + 3. 无商标许可 + + “本许可证”不提供对“贡献者”的商品名称、商标、服务标志或产品名称的商标许可,但您为满足第4条规定的声明义务而必须使用除外。 + + 4. 分发限制 + + 您可以在任何媒介中将“软件”以源程序形式或可执行形式重新分发,不论修改与否,但您必须向接收者提供“本许可证”的副本,并保留“软件”中的版权、商标、专利及免责声明。 + + 5. 免责声明与责任限制 + + “软件”及其中的“贡献”在提供时不带任何明示或默示的担保。在任何情况下,“贡献者”或版权所有者不对任何人因使用“软件”或其中的“贡献”而引发的任何直接或间接损失承担责任,不论因何种原因导致或者基于何种法律理论,即使其曾被建议有此种损失的可能性。 + + 6. 语言 + “本许可证”以中英文双语表述,中英文版本具有同等法律效力。如果中英文版本存在任何冲突不一致,以中文版为准。 + + 条款结束 + + 如何将木兰宽松许可证,第2版,应用到您的软件 + + 如果您希望将木兰宽松许可证,第2版,应用到您的新软件,为了方便接收者查阅,建议您完成如下三步: + + 1, 请您补充如下声明中的空白,包括软件名、软件的首次发表年份以及您作为版权人的名字; + + 2, 请您在软件包的一级目录下创建以“LICENSE”为名的文件,将整个许可证文本放入该文件中; + + 3, 请将如下声明文本放入每个源文件的头部注释中。 + + Copyright (c) [Year] [name of copyright holder] + [Software Name] is licensed under Mulan PSL v2. + You can use this software according to the terms and conditions of the Mulan PSL v2. + You may obtain a copy of Mulan PSL v2 at: + http://license.coscl.org.cn/MulanPSL2 + THIS SOFTWARE IS PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO NON-INFRINGEMENT, MERCHANTABILITY OR FIT FOR A PARTICULAR PURPOSE. + See the Mulan PSL v2 for more details. 
+ + + Mulan Permissive Software License,Version 2 + + Mulan Permissive Software License,Version 2 (Mulan PSL v2) + January 2020 http://license.coscl.org.cn/MulanPSL2 + + Your reproduction, use, modification and distribution of the Software shall be subject to Mulan PSL v2 (this License) with the following terms and conditions: + + 0. Definition + + Software means the program and related documents which are licensed under this License and comprise all Contribution(s). + + Contribution means the copyrightable work licensed by a particular Contributor under this License. + + Contributor means the Individual or Legal Entity who licenses its copyrightable work under this License. + + Legal Entity means the entity making a Contribution and all its Affiliates. + + Affiliates means entities that control, are controlled by, or are under common control with the acting entity under this License, ‘control’ means direct or indirect ownership of at least fifty percent (50%) of the voting power, capital or other securities of controlled or commonly controlled entity. + + 1. Grant of Copyright License + + Subject to the terms and conditions of this License, each Contributor hereby grants to you a perpetual, worldwide, royalty-free, non-exclusive, irrevocable copyright license to reproduce, use, modify, or distribute its Contribution, with modification or not. + + 2. Grant of Patent License + + Subject to the terms and conditions of this License, each Contributor hereby grants to you a perpetual, worldwide, royalty-free, non-exclusive, irrevocable (except for revocation under this Section) patent license to make, have made, use, offer for sale, sell, import or otherwise transfer its Contribution, where such patent license is only limited to the patent claims owned or controlled by such Contributor now or in future which will be necessarily infringed by its Contribution alone, or by combination of the Contribution with the Software to which the Contribution was contributed. 
The patent license shall not apply to any modification of the Contribution, and any other combination which includes the Contribution. If you or your Affiliates directly or indirectly institute patent litigation (including a cross claim or counterclaim in a litigation) or other patent enforcement activities against any individual or entity by alleging that the Software or any Contribution in it infringes patents, then any patent license granted to you under this License for the Software shall terminate as of the date such litigation or activity is filed or taken. + + 3. No Trademark License + + No trademark license is granted to use the trade names, trademarks, service marks, or product names of Contributor, except as required to fulfill notice requirements in Section 4. + + 4. Distribution Restriction + + You may distribute the Software in any medium with or without modification, whether in source or executable forms, provided that you provide recipients with a copy of this License and retain copyright, patent, trademark and disclaimer statements in the Software. + + 5. Disclaimer of Warranty and Limitation of Liability + + THE SOFTWARE AND CONTRIBUTION IN IT ARE PROVIDED WITHOUT WARRANTIES OF ANY KIND, EITHER EXPRESS OR IMPLIED. IN NO EVENT SHALL ANY CONTRIBUTOR OR COPYRIGHT HOLDER BE LIABLE TO YOU FOR ANY DAMAGES, INCLUDING, BUT NOT LIMITED TO ANY DIRECT, OR INDIRECT, SPECIAL OR CONSEQUENTIAL DAMAGES ARISING FROM YOUR USE OR INABILITY TO USE THE SOFTWARE OR THE CONTRIBUTION IN IT, NO MATTER HOW IT’S CAUSED OR BASED ON WHICH LEGAL THEORY, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. + + 6. Language + + THIS LICENSE IS WRITTEN IN BOTH CHINESE AND ENGLISH, AND THE CHINESE VERSION AND ENGLISH VERSION SHALL HAVE THE SAME LEGAL EFFECT. IN THE CASE OF DIVERGENCE BETWEEN THE CHINESE AND ENGLISH VERSIONS, THE CHINESE VERSION SHALL PREVAIL. 
+ + END OF THE TERMS AND CONDITIONS + + How to Apply the Mulan Permissive Software License,Version 2 (Mulan PSL v2) to Your Software + + To apply the Mulan PSL v2 to your work, for easy identification by recipients, you are suggested to complete following three steps: + + i Fill in the blanks in following statement, including insert your software name, the year of the first publication of your software, and your name identified as the copyright owner; + + ii Create a file named “LICENSE” which contains the whole context of this License in the first directory of your software package; + + iii Attach the statement to the appropriate annotated syntax at the beginning of each source file. + + + Copyright (c) [Year] [name of copyright holder] + [Software Name] is licensed under Mulan PSL v2. + You can use this software according to the terms and conditions of the Mulan PSL v2. + You may obtain a copy of Mulan PSL v2 at: + http://license.coscl.org.cn/MulanPSL2 + THIS SOFTWARE IS PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO NON-INFRINGEMENT, MERCHANTABILITY OR FIT FOR A PARTICULAR PURPOSE. + See the Mulan PSL v2 for more details. \ No newline at end of file -- Gitee