[Compared with Python] Conditional grouping with overlapping groups

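Task: Split employees into segments by their length of service with the company, and count the male and female employees in each segment. The segments may overlap, so one employee can be counted in more than one of them.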

Python

import pandas as pd
import datetime

def eval_g(dd: dict, ss: str):
    # Evaluate the condition string ss with the names in dd in scope
    return eval(ss, dd)

emp_file = 'E:\\txt\\employee.txt'
emp_info = pd.read_csv(emp_file, sep='\t')
# Group labels and their conditions; the last two conditions overlap
employed_list = ['Within five years', 'Five to ten years', 'More than ten years', 'Over fifteen years']
employed_str_list = ["(s<5)", "(s>=5) & (s<10)", "(s>=10)", "(s>=15)"]
# Length of service in whole years: current year minus hire year
today = datetime.datetime.today().year
arr = pd.to_datetime(emp_info['HIREDATE'])
employed = today - arr.dt.year
emp_info['EMPLOYED'] = employed
dd = {'s': emp_info['EMPLOYED']}
group_cond = []
# Regroup once per condition and keep only the group whose key is True
for n in range(len(employed_str_list)):
    emp_g = emp_info.groupby(eval_g(dd, employed_str_list[n]))
    emp_g_index = [index for index in emp_g.size().index]
    if True not in emp_g_index:
        # No employee satisfies this condition
        female_emp = 0
        male_emp = 0
    else:
        group = emp_g.get_group(True)
        sum_emp = len(group)
        female_emp = len(group[group['GENDER'] == 'F'])
        male_emp = sum_emp - female_emp
    group_cond.append([employed_list[n], male_emp, female_emp])
group_df = pd.DataFrame(group_cond, columns=['EMPLOYED', 'MALE', 'FEMALE'])
print(group_df)

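Pandas has no ready-made function for grouping by overlapping conditions, so the data has to be regrouped once per condition, and each time only the group that satisfies the condition (the one whose key is True) is kept.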
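The same per-condition idea can also be written with boolean masks instead of groupby and eval. The following is only a minimal sketch under stated assumptions: the small in-memory table is hypothetical sample data standing in for employee.txt, and its EMPLOYED column is given directly rather than derived from HIREDATE.

```python
import pandas as pd

# Hypothetical sample data standing in for employee.txt (an assumption);
# EMPLOYED holds the length of service in years.
emp_info = pd.DataFrame({
    'GENDER':   ['M', 'F', 'F', 'M', 'M', 'F'],
    'EMPLOYED': [3, 7, 12, 16, 20, 1],
})

s = emp_info['EMPLOYED']
# Label -> boolean mask. The last two ranges overlap on purpose,
# so an employee with 16 years of service matches both.
conditions = {
    'Within five years':   s < 5,
    'Five to ten years':   (s >= 5) & (s < 10),
    'More than ten years': s >= 10,
    'Over fifteen years':  s >= 15,
}

rows = []
for label, mask in conditions.items():
    # Select the rows matching this condition, then count by gender
    genders = emp_info.loc[mask, 'GENDER']
    rows.append([label,
                 int((genders == 'M').sum()),
                 int((genders == 'F').sum())])

group_df = pd.DataFrame(rows, columns=['EMPLOYED', 'MALE', 'FEMALE'])
print(group_df)
```

Each mask is evaluated independently, so an empty match simply yields zero counts and no special case for a missing True group is needed.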

esProc

  | A | B
1 | ?<5 | Within five years
2 | ?>=5 && ?<10 | Five to ten years
3 | ?>=10 | More than ten years
4 | ?>=15 | Over fifteen years
5 | E:\\txt\\employee.txt |
6 | =[A1:A4] | =A6.concat@c()
7 | =file(A5).import@t() | =A7.derive(age@y(HIREDATE):EMPLOYED)
8 | =B7.enum@r(A6,EMPLOYED) | =[B1:B4]
9 | =A8.new(B8(#):EMPLOYED,~.count(GENDER=="M"):MALE,~.count(GENDER=="F"):FEMALE) |

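esProc has a built-in enumeration grouping function (enum, used here with the @r option), which makes overlapping conditional grouping easy to implement.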
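To make concrete what a repeatable enumeration grouping returns, here is a plain-Python sketch of the idea. It is not esProc's implementation: the helper enum_groups, its repeat flag, and the sample records are all hypothetical names and data introduced for illustration.

```python
from typing import Callable, Dict, List

def enum_groups(records: List[dict],
                conditions: Dict[str, Callable[[dict], bool]],
                repeat: bool = True) -> Dict[str, List[dict]]:
    """Put each record into every group whose condition it satisfies.

    With repeat=False a record goes only to the first matching group.
    (Hypothetical helper for illustration, not an esProc or pandas API.)
    """
    groups: Dict[str, List[dict]] = {label: [] for label in conditions}
    for rec in records:
        for label, cond in conditions.items():
            if cond(rec):
                groups[label].append(rec)
                if not repeat:
                    break  # stop at the first matching condition
    return groups

# Hypothetical sample records; EMPLOYED is the length of service in years.
employees = [
    {'GENDER': 'M', 'EMPLOYED': 3},
    {'GENDER': 'F', 'EMPLOYED': 12},
    {'GENDER': 'M', 'EMPLOYED': 16},
]
conditions = {
    'Within five years':   lambda r: r['EMPLOYED'] < 5,
    'Five to ten years':   lambda r: 5 <= r['EMPLOYED'] < 10,
    'More than ten years': lambda r: r['EMPLOYED'] >= 10,
    'Over fifteen years':  lambda r: r['EMPLOYED'] >= 15,
}

overlapping = enum_groups(employees, conditions)
exclusive = enum_groups(employees, conditions, repeat=False)

# Per-group (label, male count, female count) for the overlapping grouping
summary = [
    (label,
     sum(r['GENDER'] == 'M' for r in members),
     sum(r['GENDER'] == 'F' for r in members))
    for label, members in overlapping.items()
]
print(summary)
```

With repeat turned off, the employee with 16 years of service lands only in 'More than ten years' and the 'Over fifteen years' group stays empty, which is why overlapping ranges need the repeatable form.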