Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.
[ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling
EfficientFormerV2 [ICCV 2023] & EfficientFormer [NeurIPS 2022]
Code for the paper "AdderNet: Do We Really Need Multiplications in Deep Learning?" (a toy sketch of the adder operation appears after this list)
[CVPR 2024] DeepCache: Accelerating Diffusion Models for Free
[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
[NeurIPS 2024 Spotlight]"LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS", Zhiwen Fan, Kevin Wang, Kairun Wen, Zehao Zhu, Dejia Xu, Zhangyang Wang
[ICCV 2017] Learning Efficient Convolutional Networks through Network Slimming (see the slimming sketch after this list).
List of papers related to neural network quantization in recent AI conferences and journals.
[NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
[CVPR 2021] Exploring Sparsity in Image Super-Resolution for Efficient Inference
[CVPR 2021, Oral] Dynamic Slimmable Network
Explorations into some recent techniques surrounding speculative decoding (a minimal draft-then-verify sketch appears after this list)
[ECCV 2022] Efficient Long-Range Attention Network for Image Super-resolution
Deep Face Model Compression
On-device LLM Inference Powered by X-Bit Quantization
[ECCV 2022] Official implementation of the paper "DeciWatch: A Simple Baseline for 10x Efficient 2D and 3D Pose Estimation"
[NeurIPS 2024] AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising
Soft Threshold Weight Reparameterization for Learnable Sparsity
[NeurIPS 2023] Speculative Decoding with Big Little Decoder
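AdderNet, listed above, replaces the multiply-accumulate in convolution with an L1 distance, so filter responses use only additions and subtractions. Below is a toy, illustrative sketch of that core operation in one dimension; `adder_conv1d` is an invented name, not code from the linked repository.

```python
import numpy as np

def adder_conv1d(x, w):
    """Slide filter w over signal x, scoring each window by negated L1 distance.

    Uses only additions/subtractions (plus abs), per the AdderNet idea;
    larger (less negative) scores mean a closer match.
    """
    k = len(w)
    return np.array([-np.abs(x[i:i + k] - w).sum()
                     for i in range(len(x) - k + 1)])

x = np.array([1.0, 2.0, 3.0, 2.0, 1.0])
w = np.array([1.0, 2.0, 3.0])
print(adder_conv1d(x, w))  # [ 0. -3. -4.] -- best match at the first window
```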
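Network slimming, also listed above, trains with an L1 penalty on the BatchNorm scale factors (gamma) so that unimportant channels shrink toward zero and can be pruned. The sketch below assumes PyTorch; the helper names are illustrative, not the paper's reference code.

```python
import torch.nn as nn

def slimming_l1_penalty(model, lam=1e-4):
    # Added to the task loss during training; drives the gamma of
    # unimportant channels toward zero.
    return lam * sum(m.weight.abs().sum()
                     for m in model.modules()
                     if isinstance(m, nn.BatchNorm2d))

def channel_keep_mask(bn, keep_ratio=0.5):
    # Boolean mask over channels, keeping those with the largest |gamma|;
    # pruned channels take their conv filters with them.
    gamma = bn.weight.detach().abs()
    k = max(1, int(keep_ratio * gamma.numel()))
    threshold = gamma.sort(descending=True).values[k - 1]
    return gamma >= threshold

bn = nn.BatchNorm2d(8)
nn.init.uniform_(bn.weight)        # pretend training spread the gammas out
print(channel_keep_mask(bn, 0.5))  # keeps the 4 largest-|gamma| channels
```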
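Several entries above concern speculative decoding, where a cheap draft model proposes several tokens and an expensive target model verifies them, falling back to its own token at the first mismatch. Here is a minimal greedy sketch with toy next-token functions standing in for real models; production systems verify all drafted positions in a single target forward pass and compare full distributions rather than greedy picks.

```python
def draft_next(prefix):
    return (sum(prefix) + 1) % 50  # toy stand-in for a small draft model

def target_next(prefix):
    # Toy stand-in for the large target model; agrees with the draft
    # except when the prefix length is a multiple of 4.
    return (sum(prefix) + (1 if len(prefix) % 4 else 2)) % 50

def speculative_decode(prompt, steps=12, k=4):
    out = list(prompt)
    while steps > 0:
        # 1) Draft up to k tokens autoregressively with the cheap model.
        proposal = []
        for _ in range(min(k, steps)):
            proposal.append(draft_next(out + proposal))
        # 2) Verify: keep the longest prefix the target model agrees with.
        accepted = 0
        for tok in proposal:
            if target_next(out + proposal[:accepted]) == tok:
                accepted += 1
            else:
                break
        out += proposal[:accepted]
        steps -= accepted
        # 3) At the first mismatch, emit the target model's own token.
        if accepted < len(proposal) and steps > 0:
            out.append(target_next(out))
            steps -= 1
    return out

print(speculative_decode([3, 1, 4]))
```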