【对比Python】按指定次序分组
任务:
按次序列出Technology,Production,Sales,HR的女员工人数和平均年龄
按次序列出Technology,Production,Sales,HR的女员工人数和平均年龄
Python
1 | import pandas as pd |
2 | import datetime |
3 | emp_file = 'E:\\txt\\employee.txt' |
4 | dept_seq = ['Technology','Production','Sales','HR'] |
5 | emp_info = pd.read_csv(emp_file,sep='\t') |
6 | emp_g = emp_info.groupby(by='DEPT') |
7 | emp_g_index = [index for index,group in emp_g] |
8 | today = datetime.datetime.today().year |
9 | dept_femal_list = [] |
10 | for dept in dept_seq: |
11 | if dept not in emp_g_index: |
12 | dept_femal_num = 0 |
13 | age_a = None |
14 | else: |
15 | dept_emp = emp_g.get_group(dept) |
16 | dept_emp_femal = dept_emp[dept_emp['GENDER']=='F'] |
17 | dept_femal_num = len(dept_emp_femal) |
18 | arr = pd.to_datetime(dept_emp_femal['BIRTHDAY']) |
19 | age_a = (today-arr.dt.year).mean() |
20 | dept_femal_list.append([dept,dept_femal_num,age_a]) |
21 | dept_femal_info = pd.DataFrame(dept_femal_list,columns=['DEPT','femal_num','femal_age']) |
22 | print(dept_femal_info) |
集算器
A | |
1 | E:\\txt\\employee.txt |
2 | [Technology,Production,Sales,HR] |
3 | =file(A1).import@t() |
4 | =A3.align@a(A2,DEPT) |
5 | =A4.new(A2(#):DEPT,(F_EMP=~.select(GENDER:"F"),F_EMP.count()):Femal_num,F_EMP.avg(age@y(BIRTHDAY)):Femal_age) |
以有序集合为基础的集算器加上特有的对齐运算处理这类计算就方便多了。