# PySyft

**Repository Path**: boothua-cloud/PySyft

## Basic Information

- **Project Name**: PySyft
- **Description**: Mirror of PySyft for users in mainland China
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: dev
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 2
- **Forks**: 0
- **Created**: 2022-11-07
- **Last Updated**: 2023-11-17

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README
Perform `numpy`-like analysis on `data` that remains in `someone else's` server
1. Install our handy 🛵 cli tool which makes deploying a Domain or Network server a one-liner:
`pip install -U hagrid`
2. Then run our interactive Jupyter Install 🧙🏽‍♂️ Wizard (BETA):
`hagrid quickstart`
- In the tutorial you will learn how to install and deploy:
`PySyft` = our `numpy`-like 🐍 Python library for computing on `private data` in someone else's `Domain`
`PyGrid` = our 🐳 `docker` / `k8s` / 🐧 `vm` `Domain` & `Network` Servers where `private data` lives
- During quickstart we will deploy `PyGrid` to localhost with 🐳 `docker`; however, 🛵 HAGrid can also deploy to `k8s` or a 🐧 `ubuntu` VM on `azure` / `gcp` / `ANY_IP_ADDRESS` by using 🔨 `ansible`†
3. Read our 📚 Docs
4. Ask Questions ❔ in `#support` on Slack
# Install Notes
- HAGrid Requires: 🐍 `python` 🐙 `git` - Run: `pip install -U hagrid`
- Interactive Install 🧙🏽‍♂️ Wizard (BETA) Requires 🛵 `hagrid`: - Run: `hagrid quickstart`
†`Windows` does not support `ansible`, preventing some remote deployment targets
- PySyft Requires: 🐍 `python 3.7+` - Run: `pip install -U syft --pre`
\*`macOS` Apple Silicon users need cmake: `brew install cmake`
‡`Windows` users must run this first: `pip install jaxlib==0.3.14 -f https://whls.blob.core.windows.net/unstable/index.html`
- PyGrid Requires: 🐳 `docker` / `k8s` or 🐧 `ubuntu` VM - Run: `hagrid launch ...`
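Before installing, it can help to confirm that the prerequisites listed above are present. The following is a minimal sketch, not part of HAGrid: the function name `check_prerequisites` is hypothetical, and it assumes a local `docker` deployment (swap the tool names for a `k8s` or VM target).

```python
import shutil
import sys

# Minimum Python version taken from the "PySyft Requires" note above.
MIN_PYTHON = (3, 7)

# Executables named in the notes above: git for HAGrid, docker for a local PyGrid.
# Assumption: a docker-based deployment; adjust for other targets.
REQUIRED_TOOLS = ("git", "docker")


def check_prerequisites(tools=REQUIRED_TOOLS, min_python=MIN_PYTHON):
    """Return a dict mapping each requirement to True (found) or False (missing)."""
    report = {"python": sys.version_info[:2] >= min_python}
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        report[tool] = shutil.which(tool) is not None
    return report


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

This only checks that the executables are on `PATH`; it does not verify that the Docker daemon is running or that any particular version is installed.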
# Versions
`0.7.0 beta` - `dev` branch 👈🏽
`0.6.0` - Course 3
`0.5.1` - Course 2 + M1 Hotfix
`0.2.0` - `0.5.0` Deprecated
PySyft and PyGrid use the same `version`, and it's best to match them up where possible. We release weekly betas which can be used in each context:
PySyft: `pip install -U syft --pre`
PyGrid: `hagrid launch ... tag=latest`
HAGrid is a CLI / deployment tool, so the latest version of `hagrid` is usually the best.
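One way to check that two version strings line up is sketched below. This is an illustration only: the helper names are hypothetical, the rule "same major.minor" is an assumption about what "matching" means here, and the pre-release string `0.7.0b1` in the usage line is a made-up example rather than a real release.

```python
import re


def major_minor(version):
    """Extract (major, minor) from strings such as '0.7.0' or '0.7.0b1'."""
    match = re.match(r"\s*v?(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def versions_match(syft_version, grid_version):
    """True when both strings share the same major.minor release line (assumed rule)."""
    return major_minor(syft_version) == major_minor(grid_version)


if __name__ == "__main__":
    # Hypothetical beta string compared against the 0.7.0 release line.
    print(versions_match("0.7.0b1", "0.7.0"))
```

On Python 3.8 or later, the installed PySyft version can be read with the standard library call `importlib.metadata.version("syft")` and passed in as the first argument.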
# What is Syft?
`Syft` is OpenMined's `open source` stack that provides `secure` and `private` Data Science in Python. Syft decouples `private data` from model training, using techniques like [Federated Learning](https://ai.googleblog.com/2017/04/federated-learning-collaborative.html), [Differential Privacy](https://en.wikipedia.org/wiki/Differential_privacy), and [Encrypted Computation](https://en.wikipedia.org/wiki/Homomorphic_encryption). This is done with a `numpy`-like interface and integration with `Deep Learning` frameworks, so that you as a `Data Scientist` can maintain your current workflow while using these new `privacy-enhancing techniques`.
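To make the first of these techniques concrete, here is a minimal sketch of the aggregation step in federated averaging: each participant trains locally and shares only model weights, which are combined in proportion to the number of local samples. It is plain Python written for illustration, not the way Syft implements it, and the function name `federated_average` is hypothetical.

```python
def federated_average(client_updates):
    """Combine client weight vectors into one, weighted by local sample count.

    client_updates: list of (weights, num_samples) pairs, where `weights`
    is a list of floats of the same length for every client.
    """
    if not client_updates:
        raise ValueError("need at least one client update")
    total_samples = sum(n for _, n in client_updates)
    if total_samples <= 0:
        raise ValueError("total number of samples must be positive")

    dim = len(client_updates[0][0])
    averaged = [0.0] * dim
    for weights, num_samples in client_updates:
        if len(weights) != dim:
            raise ValueError("all clients must send vectors of the same length")
        for i, w in enumerate(weights):
            # Each client contributes in proportion to its share of the data.
            averaged[i] += w * num_samples / total_samples
    return averaged


if __name__ == "__main__":
    # Two hypothetical clients; the raw training data never leaves them.
    print(federated_average([([1.0, 2.0], 1), ([3.0, 4.0], 3)]))
```

Only the weight vectors and sample counts cross the network in this scheme; the underlying records stay with each participant.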
### Why should I use Syft?
`Syft` allows a `Data Scientist` to ask `questions` about a `dataset` and, within `privacy limits` set by the `data owner`, get `answers` to those `questions`, all without obtaining a `copy` of the data itself. We call this process `Remote Data Science`. It means that, in a wide variety of `domains` across society, the current `risks` of sharing information (`copying` data) with someone, such as privacy invasion, IP theft and blackmail, will no longer prevent the vast `benefits`, such as innovation, insights and scientific discovery, which secure access will provide.
No more cold calls to get `access` to a dataset. No more weeks of `wait times` to get a `result` on your `query`. It also means `1000x more data` in every domain. PySyft opens the doors to a streamlined Data Scientist `workflow`, all with the individual's `privacy` at its heart.
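The idea of answering questions within privacy limits set by the data owner can be illustrated with a toy example. The sketch below is not the PySyft API: `ToyDomain` and `BudgetExceeded` are hypothetical names, the budget is a simple running total of epsilon, and the noise is a basic Laplace mechanism that assumes the dataset size is public and every value is clipped to a known range.

```python
import random


class BudgetExceeded(Exception):
    """Raised when a query would spend more privacy budget than remains."""


class ToyDomain:
    """Hypothetical stand-in for a data owner's server; not the PySyft API."""

    def __init__(self, private_values, lower, upper, budget, seed=None):
        # Clip every value so a single record has bounded influence.
        self._values = [min(max(v, lower), upper) for v in private_values]
        self._width = upper - lower
        self.remaining_budget = budget
        self._rng = random.Random(seed)

    def noisy_mean(self, epsilon):
        """Answer a mean query, spending `epsilon` from the remaining budget."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if epsilon > self.remaining_budget:
            raise BudgetExceeded("not enough privacy budget left")
        self.remaining_budget -= epsilon

        n = len(self._values)
        # Replacing one record moves the clipped mean by at most width / n.
        scale = self._width / (n * epsilon)
        # The difference of two exponential draws is Laplace(0, scale) noise.
        noise = self._rng.expovariate(1 / scale) - self._rng.expovariate(1 / scale)
        return sum(self._values) / n + noise


if __name__ == "__main__":
    domain = ToyDomain([34, 29, 41, 52], lower=0, upper=100, budget=1.0, seed=0)
    print(domain.noisy_mean(epsilon=0.5))  # the data scientist sees only this number
    print(domain.remaining_budget)
```

The data scientist receives a noisy answer and never a copy of `private_values`; once the budget is spent, further queries are refused.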
# Tutorials
| Data Owner | Data Scientist | Data Engineer |
|---|---|---|
| - Deploy a Domain Server<br>- Upload Private Data<br>- Create Accounts<br>- Manage Privacy Budget<br>- Join a Network<br>- Learn how PETs streamline Data Policies | - Install Syft<br>- Connect to a Domain<br>- Search for Datasets<br>- Train Models<br>- Retrieve Secure Results<br>- Learn Differential Privacy | - Setup Dev Mode<br>- Deploy to Azure<br>- Deploy to GCP<br>- Deploy to Kubernetes<br>- Customize Networking<br>- Modify PyGrid UI |

| 👨🏻‍💼 Data Owners | 👩🏽‍🔬 Data Scientists |
|---|---|
| Provide `datasets` which they would like to make available for `study` by an `outside party` whom they may or may not `fully trust` to have good intentions. | Are end `users` who desire to perform `computations` or `answer` a specific `question` using one or more data owners' `datasets`. |

| 🏰 Domain Server | 🔗 Network Server |
|---|---|
| Manages the `remote study` of the data by a `Data Scientist` and allows the `Data Owner` to manage the `data` and control the `privacy guarantees` of the subjects under study. It also acts as a `gatekeeper` for the `Data Scientist's` access to the data to compute and experiment with the results. | Provides services to a group of `Data Owners` and `Data Scientists`, such as dataset `search` and bulk `project approval` (legal / technical) to participate in a project. A network server acts as a bridge between its members (`Domains`) and their subscribers (`Data Scientists`) and can provide access to a collection of `domains` at once. |
# Disclaimer
Syft is under active development and is not yet ready for pilots on private data without our assistance. Early access participants are welcome to contact us via [Slack](https://slack.openmined.org/) or email with any questions or use cases they would like to discuss.
# License
[Apache License 2.0](LICENSE)