All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- Added for new features.
- Changed for changes in existing functionality.
- Deprecated for soon-to-be removed features.
- Removed for now removed features.
- Fixed for any bug fixes.
- Security in case of vulnerabilities.
please add your unreleased change here.
- [component] add new io.data_sink component, used to export data to an external data source
- [component] add new psi_tp component for three-party PSI
- [component] add sql_processor component for handling SQL preprocessing
- [component] psi component adds a report that outputs the row counts
- [component] sf is split into two parts: sf without FL algorithms and sf_fl. sf-lite release contains sf, sf-full release contains both sf and sf_fl.
- [sgb] The label holder bucket sum now uses HEU calculation, removing the need for the Numba dependency
- [spu] bump spu version to 0.9.3.dev20241101
- [sgb] Fix sgb set params non-idempotent issue
- [component] IO component supports import and export sgb/glm model
- [component] Switch from ray to a local task scheduler
- [component] Support export SGD/GLM 2-Party HE model package
- [component] component reflect: include all components in the stats/io/preprocessing packages (excluding psi), and update the component version to 1.0.0
- [component] Integrate with DataProxy SDK
- [data] Change single party r2_score to sklearn function
- [docs] Security warning translation
- [sgb] Fix checkpoint prediction initialization
- [component] support sql null
- [component] io_write_data supports xgb
- [component] Add expr_condition_filter
- [component] PSI supports specifying party
- [component] fix training error on empty tree
- [component] support tweedie learning objective in SGB
- [component] update graph builder in model export
- [component] Add stats psi component
- [component] Add score card component
- [component] Add data sampling component
- [component] Add type cast component
- [component] Optimize GLM & LR training speed by using beaver cache
- [SLModel] Add attack method BLA
- [SLModel] Add defense method CAE
- [sgb] Support sample weight training
- [component] fix large deviation between the online and offline predictions of the GLM model
- [component] fix sgb export model sigmoid type being inconsistent with the offline prediction sigmoid type
- [component] fix sgb/xgb export model missing base score
- [component] pvalue support GLM
- [component] onehot add drop parameter to support first & mode
- [component] XGB/SGB support checkpoint
- [component] LR support report weights
- [component] Add union component
- [FLModel/SLModel] tf backend support custom loss
- [SLModel] Base model support additional loss
- [SLModel] Add attack method FSHA
- [SLModel] Add defense method MID
- [SLModel] Add defense method FedPass
- [Infra] Add arm64 build support
- SL: custom training step supported by lightning style base module.
- No longer providing x64 macOS binary packages.
- SL: grad_average is supported on GPU.
- SL: mix_up defence can be called through callback.
- Simulation: add some unpartitioned datasets.
- Fix order in component load table.
- Serving: Update serving linkfunc type.
- FL: rename get/set_gradients methods
- SGX: improve doc for the accuracy analysis.
- SL: import fia attack.
- Add gradient_average callback in SL.
- Fix tuner default resource allocate.
- Boost the speed of autoattack benchmark.
- Upgrade to Python 3.10.
- Component: Support Early Stop in SGB via XGBoost Callbacks.
- Component: Add storage interface and oss/s3 storage impl.
- FLModel: Add new sample methods for datasets.
- Device: Support cheetah h2a.
- Tuner optimizes resource allocation logic.
- Docs: Add SecretFlow Benchmark Results.
- Docs: Add more docs for SS-GLM.
- Use ray instead of secretflow-ray and bump to ray-2.9.
- Add Criteo(100M) Dataset.
- Add split learning nn component.
- Add sgb & ss_xgb model export.
- Split learning support prediction for file input.
- GPU support for split learning attack algorithm.
- Unify nn Callback naming.
- Split learning in PyTorch supports GPU in debug mode.
- Split learning in PyTorch supports pipeline strategy.
- Sparse compression supports the compression of multi-dimensional data, such as images.
- SLModel supports input in the form of file names and allows for multiple label inputs.
- Component: prediction operator results can now save feature columns.
- Component: binning operators can now view reports directly.
- Component: binning rule modification is supported.
- Component: support model exports, support both online and offline prediction.
- fix: bump rayfed to 0.2.0a18 to fix shutdown lock.
- Component: improve GLM training accuracy.
- Component: reduce GLM training memory and computation cost.
- Add a new feature of chunked computation to control the memory peak.
- The binning component supports displaying binning rules.
- Fix a bug where saving the SGB model fails when the number of trees in the model is 1.
- Fix bin rule report.
- sgb/ss_xgb/ss_glm/ss_sgd can save specified columns with result now.
- ss_glm: expose two params infeed size and newton.
- SplitLearning
- Add a defense method against data reconstruction attacks.
- Adapt replay attack and gradreplace attack for deepfm.
- Add exploit attack.
- Component: update fillna descriptions.
- Fix SS-GLM tutorial.
- Add grad replace attack and replay attack.
- Add autoattack benchmark examples.
- Expose job_name param in sf.init.
- Bump rayfed version: optimizing Error Propagation and Capture.
- Component: woe_bins requires at least 5 bins to read.
- Component: add barrier_on_shutdown as sf cluster config.
- Add DataProxy binary writer.
- Add PSI v2 APIs.
- Add sl_deepfm_torch, sl_dnn_tf and sl_dnn_torch applications.
- Support sgb conversion into a single model.
- Support debug mode in tuner.
- Docs: Update installation and deployment docs.
- Docs: Add fl gpu docs.
- Docs: Add debug mode docs.
- Component: onehot_encode component limits size of col_rules.
- Component: sgb_predict component rm batch_size param.
- Fix pytorch_audio_classification example and tutorial.
- Fix bin substitution for incomplete rule.
- Fix sgb_predict component when all features come from a party.
- Make barrier_on_shutdown optional.
- Support SGB label holder without features.
- Support SL Model training on file data with multiple labels.
- Add SL ResNet and VGG application.
- Secretflow ic: Add package interconnection protobuf files.
- Component: Add feature calculate component to generate new features by performing calculations on original features.
- Component: Support SGB prediction on big dataset.
- SGB optimize memory usage in prediction.
- Component: Bump groupby statistics version.
- Component: Improve translation.
- Component: Fix woe io and fillna.
- Add IO component including read, write and identity.
- Change groupby component to by-query style.
- Add secretflow tuner for automl and autoattack.
- Add IO component including read, write and identity.
- Change groupby component to by-query style.
- Support file data input in SLModel.
- Expose copts in SPU devices.
- Component: Add benchmarks.
- Add federated callback framework.
- Add new agg method -- concat,sum.
- Add ic_mode.
- Component: add upper_bound for max_group_size in groupby_statistics.
- Component: modify test_size and train_size restrictions in train_test_split.
- Clear legacy history class in FL.
- Kuscia adapter: check datasource only when author matches.
- Component: Support eq_range binning.
- Component: Support TLS in nsjail.
- Component: Add test framework.
- Component: Adapt to DataProxy.
- Component: Select features in binning.
- Component: biclassification_eval returns nan values if min_item_cnt_per_bucket doesn't match.
- Enhance sgb.
- Component: Fix groupby_statistics
- Add debug mode.
- Support GLM model transition from MPC version to Federated version.
- FLModel supports PFL to allow custom aggregation logic on server.
- Add split learning applications: BST, MMoE.
- Add sparse, quantized and mixed compressor.
- Add polars backend for dataframe and SL model to enhance the data processing performance.
- Refactor data preprocessing module (VDataFrame, Partition).
- Doc: rearrange docs for split recommendation suite: SplitRec.
- Doc: Update PSI benchmark.
- Component: expose cross_silo_comm_backend option for secretflow init.
- Component: add vert_binning.
- Component: migrate to SecretFlow Open Specification.
- Support naive sl on torch backend.
- Add kwargs for custom strategy in sl.
- Add bst and mmoe in sl applications.
- Input attributes of some components are modified.
- Convert domain data to individual tables in preprocess_sf_node_eval_param of kuscia adapter.
- Secretflow support debug mode.
- Add vert binning for equal range binning method
- The data preprocessing module (VDataFrame, Partition) has been refactored, enhancing the data processing performance (primarily targeting the Polars backend).
- Fix error when FLModel with tf backend uses GPU.
- Fix kuscia adapter
- FLModel supports PFL to allow custom aggregation logic on server.
- component: expose cross_silo_comm_backend option for secretflow init.
- SPU device: rm invalid use_link option.
- update psi benchmark.
- fix tls on kuscia.
- support GLM model transition from MPC version to Federated version.
- add sparse, quantized and mixed compressor.
- add polars backend for dataframe and SL model.
- add DisPFL and VFGNN example.
- add cross_silo_comm_backend option to SFClusterConfig.RayFedConfig.
- fix dataset build in SL.
- SLModel: supports quantization compression algorithm, reducing communication volume by 2-4 times.
- SLModel: supports pipeline strategy, which can accelerate model training by 2-4 times in most scenarios.
- SLModel: PyTorch backend supports GPU.
- SLModel: introduces two attack and defense algorithms, LIA and FIA, for testing model security in the research and development stage.
- SLModel: supports a mode where one party only provides labels without providing features.
- Component: GLM train and predict components.
- Support the usage of brpc link as a backend for cross-silo communication.
- Component: add more parameters for SGB components.
- Stateful task for teeu
- docs: DeepFM translation
- Switch to shared workflow
- Add five papers in Vertical Federated Learning
- docs: update references on homomorphic encryption
- Add new quantized compressor method and tutorial
- PSI use psi_csv in psi comp.
- GLM train and predict components
- Support the usage of brpc link as a backend for cross-silo communication.
- Label inference attack v3
- SGB upgrade: use SGBFactory to replace SGB, update parameters and tutorials. SGB now supports more functionalities.
- Predict supports callbacks, call the callback function before/after prediction starts and after every step.
- SLModel fix bug in handling data with databuilder
- The FLModel solves the problem of the production mode hanging due to a small batch size.
- SLModel supports AggLayer
- SLModel (nn/deepfm) supports one party providing features and the other party providing labels.
- Component Specification and SecretFlow Component List v0.0.1.
- Bump spu to 0.4.1b0
- Fix logic error when sl base model load from none
- Split learning adds a DeepFM application for recommendation scenarios.
- Split learning adds PyTorch support.
- Add IO tutorials for federated learning.
- Reorg DP strategies
- SGB and XGB refactor. Add data checks. Improve qcut.
- Add the preview version of components.
- Bump spu to 0.3.3b2
- TEEU function serialization.
- Correct TEEU function serialization protocol and docs.
- Fix SPU compilation cache bug
- Bump spu to 0.3.3b2
- Add missing init files.
- TEE python Unit(TEEU) is introduced as the TEE cryptographic device and enables authorized computation with authorized data in TEE. TEEU brings more possibilities for hybrid computation.
- An experimental SecretFlow component design.
- SGB feature: add support for pre-pruning and model save & load.
- Use pytest instead of unittest.
- Bump spu to 0.3.3b0
- Bump heu to 0.4.3b2
- Fix hess lr auc err with large learning_rate.
- SecureBoost and its benchmark.
- Unbalanced PSI
- Two sub-protocols for generating cache and transmitting cache.
- Online with shuffling sub-protocol, supporting big data parties to obtain data.
- Semi-homomorphic encryption protocol - OU
- Bump spu to 0.3.2b12
- Fix psi_join_csv output columns error.
- Fix heu object decode.
- Bump spu to 0.3.2b11
- SCQL (Secure Collaborative Query Language).
- Bump spu to 0.3.2b9.
- Bump many dependencies for security fix.
- Bump RayFed to 0.1.1a.
- give min num_cpus for simulation.
- add init.py to sl tensorflow strategy folder.
- add party as resource label in simulation mode.
- add an option whether exit on cross-silo sending.
- put all requires in one file except dev.
- Fix psi_join_csv output columns error.
- fix heu object decode.
- Bump:
- rayfed to 0.1.0b0.
- spu to 0.3.1b9.
- sf-heu to 0.3.2b1.
- add export_model api for SLModel
- add get_params api for preprocessing
- VDataFrame read_csv uses spu.psi_csv instead of spu.psi_df to reduce memory usage.
- Bump rayfed to 0.1.0a10.
- Control the concurrency of ray task in SLModel fit/predict/evaluate.
- psi join multi-key sort command parameters
- Optimize psi join memory usage.
- Bump rayfed to 0.1.0a9.
- SS GLM distribution type err.
- GLM.
- bump rayfed to 0.1.0a8.
- SSXgb for user_specified_num_returns=1.
- SLModel sets steps_per_epoch for worker.
- set num_cpu and resources only when start ray with local mode.
- Multi controller based on rayfed.
- Rewrite data builder.
- Bump to yacl.
- utils.testing.cluster_def supports NPC.
- FLModel save path.
- SSXgb col sub error.
- HomeXgb callback.
- Compress mask.
- spu.call fails when SPUCompilerNumReturnsPolicy is FROM_USER and user_specified_num_returns is 1.
- OneHotEncoder remove PYUObject property.
- Add sl_model metrics wrap to fix format error.
- Add Finetune and FedEval to SFXgboost
- Add SLModel support for multiple parties (>=2)
- FLModel supports most metrics of regression and classification
- SLModel can be initialized without model.
- PSI doc typos.
- Fix the logic error in the savemodel and loadmodel paths in SLModel and FLModel
- Add scorecard.
- Add replace/mode function to DataFrame.
- Add round function to VDataFrame.
- Add psi_join_csv and psi_join_df.
- Add preprocessing.LogroundTransformer.
- Add args to preprocessing.OneHotEncoder.
- Bump dependencies
- secretflow-ray to 2.0.0.dev2
- Update psi_df doc.
- Optimize sl_model by tf_function.
- Add curve parameter for ecdh psi.
- Protect biclassification, psi and pva with pyu object.
- Modify XgbModel predict api.
- Raise exception if spu_fe.compile fails.
- Fix quantile security vulnerability.
- Fix woe bin bugs.
- Fix psi_join recv timeout.
- omp_num_threads param for secretflow init().
- Regression and biclassification evaluation.
- Xgboost evaluation.
- Horizontal fl supports default naive aggregate for metrics.
- PVA calculation.
- Remove graph util NodeDataLoader.
- VDataFrame docstring.
- Remove dependencies
- dgl
- Get rid of import tensorflow/torch when import secretflow.
- Add license file
- Fix sl predict & remove reveal
- Fix typos in function docs.
- Bump dependencies
- TensorFlow to 2.10.0
- Jax to 0.3.17
- Jaxlib to 0.3.15
- Bump dependencies
- sf-heu to 0.2.0
- spu to 0.2.5
- Missing requirements in dev-requirements.txt.
- SPU config param: throttle_window_size
- PSI param: bucket_size
- SPU config param http_timeout_ms defaults to 120s.
- Bump dependencies
- sf-heu to 0.1.3.2
- Add pytorch backend for fl model for classification
- Add FL Dp strategy
- Update document
- Shrink docker image size
- Bump dependencies
- sf-heu to 0.1.3.1
- Use secretflow-ray instead of ray.
- Add steps_per_epoch parameter to callback function of SLBaseTFModel.
- Bump dependencies
- sf-heu to 0.1.2
- Remove example of mixlr as mixlr is in official code already.
- Fix psi docs.
- Pytorch backend and FL strategy.
- SS pvalue.
- Horizontal NN global DP with RDP accountant.
- HEU supports encrypt with audit log.
- HEU uses c++ numpy api.
- Update ant pypi address.
- Complete cluster model deployment doc.
- Use multiprocess.cpu_count instead of multiprocessing.cpu_count for compatibility with macOS.
- Fix sl gnn test.
- fix model handle_data parties_length by adding partition_shape to dataframe
- SS VIF.
- FL strategy: FedProx.
- Split GNN.
- Remove duplicated shape_spu_to_np & dtype_spu_to_np in spu.py.
- Development and release docker.
- FL model strategy.
- Sigmoid approximation in python.
- SS LR.
- Vertical FL LR.
- Auto ray.get for nested params with pyu objects in proxy decorated cls.
- Link desc in spu construction.
- Refactor datasets from oss instead of lfs.
- Many doc improvements.
- SecureAggregator average when weights are multi-dimensional.
- Vertical dp.
- Many docs improvements.
- Increase H2A mask bits
- Include c++ lib in setup.
- simulation.dataset for tutorial
- update tutorial of FL SL & SFXgboost
- add csv stream reader for FL
- fix sf.init argument.
- typos, grammatical errors, implicit in docs.
- Secretflow shutdown.
- LabelEncoder returns np.int dtype.
- FlModel supports csv loader.
- Rename PPU to SPU.
- MixLR demo.
- DP on split learning DNN.
- XGBoost tutorial.
- DataFrame and FedNdarray support astype method.
- Use lfs instead of http file.
- FL model no longer requires a model definition when using load_model.
- dataframe.to_csv returns object ref.
- Use more secure random.
- Complete security and not-for-production warning.
- SplitDNN dp.
- Use lfs.
- XGBoost tutorial.
- DataFrame.to_csv returns object refs for further wait.
- Runtime_config as input to utils.testing.cluster_def.
- SFXgboost optimization.
- Horizontal preprocessing
- StandardScaler
- KBinsDiscretizer
- Binning
- Vertical preprocessing: WOE binning and substitution.
- HEU supports int/fxp data type.
- HEU Object supports slice and sum.
- Differential privacy.
- API docstrings.
- Custom pytree node to ppu.
- English docs.
- Remove default reveal in aggregator and comparator.
- Bump dependencies
- jax to 0.3.7
- sf-heu to 0.0.5
- sf-ppu to 0.0.10.4
- FL model: up early stop from step to epoch.
- SecureAggregation uses powers of 2.
- Rename vdf partitions_dimensions to partition_columns.
- Use *args instead of args in aggregation for reducing ray task dependency.
- FL model progress bug.
- train_test_split typo.
- Fix PPU dtype mismatch caused by JAX 32bit mode.
- Vertical PearsonR.
- FlModel/SlModel support model path dict.
- Upgrade sf-ppu version to 0.0.7.1
- HEU improvements
- Split learning benchmark model
- SFXgboost for homo xgboost training
- FL: different batch size for different clients.
- Wait method for pyu objects.
- FLModel evaluate returns detailed metrics.
- PPU listen address.