diff --git a/README.md b/README.md
index 07b48670..5ce00c4e 100644
--- a/README.md
+++ b/README.md
@@ -260,6 +260,7 @@
 | 179 | [Feature engineering on a column with map](./md/179.md) | pandas map | v1.0 | ⭐️⭐⭐ |
 | 180 | [Convert a category column to numeric](./md/180.md) | pandas category | v1.0 | ⭐️⭐⭐ |
 | 181 | [Ranking with rank](./md/181.md) | pandas rank | v1.0 | ⭐️⭐⭐ |
+| 182 | [Downsample data, changing the hourly step to daily](./md/182.md) | pandas resample | v1.0 | ⭐️⭐⭐ |
 
 ### Python in Practice
 
diff --git a/img/182-1.jpg b/img/182-1.jpg
new file mode 100644
index 00000000..73f93a33
Binary files /dev/null and b/img/182-1.jpg differ
diff --git a/img/182-2.jpg b/img/182-2.jpg
new file mode 100644
index 00000000..bafd8be6
Binary files /dev/null and b/img/182-2.jpg differ
diff --git a/md/182.md b/md/182.md
index f47d021c..27454527 100644
--- a/md/182.md
+++ b/md/182.md
@@ -1,9 +1,41 @@
 ```markdown
 @author jackzhenguo
-@desc 
+@desc Downsample data, changing the step from hours to days
 @tag
 @version 
 @date 2020/03/20
 ```
 
-
\ No newline at end of file
+
+182 How do you downsample data from an hourly step to a daily step?
+
+Given time series data with an hourly step, is there a quick trick for downsampling it to daily data?
+First, generate some test data with the columns 商品编码 (product code), 商品销量 (sales volume), and 商品库存 (stock):
+```python
+import pandas as pd
+import numpy as np
+```
+
+```python
+df = pd.DataFrame(np.random.randint(1,10,size=(240,3)),
+                  columns = ['商品编码','商品销量','商品库存'])
+```
+
+```python
+df.index = pd.date_range('2000-01-01', periods=240, freq='h')
+df
+```
+
+This produces 240 rows of data at hourly intervals:
+
+![](../img/182-1.jpg)
+
+The trick: use the resample method to aggregate by day (D), here summing the 商品销量 (sales volume) column:
+```python
+day_df = df.resample("D")["商品销量"].sum().to_frame()
+day_df
+```
+
+The result has 10 rows, since 240 hours is exactly 10 days:
+
+![](../img/182-2.jpg)
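
For anyone who wants to try the downsampling step from `md/182.md` outside a notebook, here is a minimal self-contained sketch. It is not part of the patch. The fixed seed, the 2000-01-01 start date, and the extra `day_stats` aggregation are assumptions added for illustration. The hourly index is built with `pd.date_range` and a `pd.Timedelta` step, so the example does not depend on how frequency aliases are spelled in a given pandas version.

```python
import numpy as np
import pandas as pd

# Assumption: a fixed seed for reproducibility (the tutorial uses unseeded data).
rng = np.random.default_rng(0)

# 240 hourly timestamps; the start date is an assumed value for illustration.
index = pd.date_range("2000-01-01", periods=240, freq=pd.Timedelta(hours=1))

# Same column names as the tutorial: product code, sales volume, stock.
df = pd.DataFrame(
    rng.integers(1, 10, size=(240, 3)),
    index=index,
    columns=["商品编码", "商品销量", "商品库存"],
)

# Downsample to one row per calendar day, summing the 24 hourly sales values.
day_df = df.resample("D")["商品销量"].sum().to_frame()

# Illustrative extension: pick a different aggregation per column.
day_stats = df.resample("D").agg({"商品销量": "sum", "商品库存": "last"})

print(day_df.shape)  # (10, 1)
```

Summing suits a flow quantity such as sales, while a level such as stock is often better summarized by its last (or mean) value in each day, which is why `day_stats` treats the two columns differently.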