当接收到销售“主表”以及用于检索京东买家姓名的“人名对照表”时,立即激活本技能。Agent 将作为极其严谨的数据清洗专家,在内存中调度类似 Excel 的 VLOOKUP 检索与多表关联逻辑,严格执行以下定义的 16 列重构、条件清洗、字号规范、复合排序与全行色彩渲染规则。
🚨 终极铁律:绝对不能无中生有!如出现 Agent 无法完成或缺少源数据的情况,对应单元格必须留白(全空格或保持空单元格),严禁胡乱捏造。
\---
= 或者整个文本被双引号 " 包裹(例如:="330174..." 或 "焦"),Agent 必须自动将它们剥离,只留下中间最纯净的明文文本,然后再进行后续的逻辑拼接、条件判断与公式生成。\---
\---
生成的 Excel 必须包含以下 16 列,行表头文字必须完全一致,展示时不思考含义,仅作表头:
\["日期", "省份", "区域", "客户", "餐饮配料", "直销", "品种", "规格", "香型", "数量", "含税单价", "含税金额", "不含税金额", "备注", "订单号", "所属公司"]
\---
湖南;“广西壮族自治区 ...” ➡️ 广西;“北京市 ...” ➡️ 北京。东北华北 (注:遇到安徽必须输出华北)西北华中西南华东华南广东湖南北京、上海、重庆、天津),绝对禁止拼接二级城市或区县名,从源头上彻底阻断出现“北京北京3305...”等重复地名的现象!省份/自治州 + 地级市。清洗逻辑具备严格的先后顺序:①【关键剥离顺序】:必须优先检查并剥离末尾的“地区”(两个字)后缀!确认不包含“地区”后,再剥离“市”、“县”、“区”、“盟”等单字行政后缀。严禁先切单字。(例如:“新疆塔城地区”,必须先匹配并切除“地区”变为“新疆塔城”;严禁因先切除“区”字而错误残留为“新疆塔城地”)。
② 移除“哈尼族”、“土家族”、“苗族”、“彝族”、“壮族”、“自治州”等所有民族和自治谓称。
③ 安全兜底防御:针对“杭州”、“温州”、“广州”等本身自带“州”字的城市,如果去掉“州”字会导致核心名变成单字(如 杭、温、广),必须强制保留“州”字。只有当去掉“州”字后剩余文本长度 >= 2 时(如“红河州”->“红河”),才允许切除。
浙江杭州;云南省 红河哈尼族彝族自治州 ➡️ 云南红河;广西壮族自治区 桂林市 ➡️ 广西桂林;新疆维吾尔自治区 塔城地区 ➡️ 新疆塔城。地理前缀 + 检索到的人名(例如:浙江杭州张三、北京李四)。若对照表中无此单号或姓名为空,则人名留空,仅保留地理前缀。地理前缀 + 子单原始单号(例如:云南红河5116035746666059715、北京3305519328451029988)。餐饮配料。【字号:强制 11 号】直销。【字号:强制 11 号】盐焗粉、浓缩鲜香粉、烧腊香味素、去腥增香调味油、凉拌菜调味油 之一的 ➡️ 统一输出:调味品I+G呋喃酮MCP甲基香兰素乙基12KG箱1kg/瓶0.5kg/瓶5KG箱10大小箱箱"250克"、"箱"、"瓶"、"桶"、"500g"、"1kg"、"30g\3" 关键字予以剔除并剥离。(示例*:“250克纯瓶”转换为 纯;“焦箱”转换为 焦`)。【字号:12号】主表【单品支付金额】 / 主表【实发数量】。【字号:12号】=J{行号}\K{行号}(例如第12725行,单元格公式为 =J12725\K12725)。【字号:强制 11 号】=L{行号}/1.13(例如第12690行,单元格公式为 =L12690/1.13)。重中之重:针对该整列单元格,必须在 Excel 中将单元格格式设置为“数值”格式,并严格保留 2 位小数。 【字号:强制 11 号】淘宝、天猫、京东、拼多多、抖音 这五个标准核心核心词。肇庆香料华宝星湖\---
调味品 的行,必须强制集中排在当前店铺区块的最前面。不是“乙基” 的所有其他行(如香兰素、I+G、MCP、甲基等)。乙基 的行,必须全部排在当前店铺区块的最后面。必须对整行(从 A 列到 P 列,包括没有任何文字的绝对空白单元格)进行无死角底色填充:
#FFFF00#92D050#00B0F0#ED7D31\---
若 Agent 在执行过程中调用 Python 进行物理建表,已将切词逻辑完美修正(多字后缀在前,单字在后):
import re
import os
import pandas as pd
from openpyxl import Workbook
from openpyxl.styles import Alignment, PatternFill, Font
def clean\_shell(val):
"""全局自动脱壳清洗工序"""
if pd.isna(val): return ""
val\_str = str(val).strip()
if val\_str.startswith('='):
val\_str = val\_str.lstrip('=').strip('"')
if val\_str.startswith('"') and val\_str.endswith('"'):
val\_str = val\_str.strip('"')
return val\_str.strip()
def process\_rebuild\_engine(main\_table\_path, name\_table\_path, output\_path):
df\_main = pd.read\_csv(main\_table\_path) if main\_table\_path.endswith('.csv') else pd.read\_excel(main\_table\_path)
# 建立京东人名 VLOOKUP 字典
vlookup\_dict = {}
if name\_table\_path and os.path.exists(name\_table\_path):
df\_name = pd.read\_csv(name\_table\_path) if name\_table\_path.endswith('.csv') else pd.read\_excel(name\_table\_path)
id\_col = \[c for c in df\_name.columns if '单号' in str(c) or '订单' in str(c)]\[0]
name\_col = \[c for c in df\_name.columns if any(x in str(c) for x in \['人', '名', '姓名'])]\[0]
for \_, n\_row in df\_name.iterrows():
k = clean\_shell(n\_row.get(id\_col, ''))
v = clean\_shell(n\_row.get(name\_col, ''))
if k: vlookup\_dict\[k] = v
region\_rules = {
'东北': \['吉林', '黑龙', '辽宁'],
'华北': \['北京', '天津', '安徽', '山东', '内蒙', '河北'],
'西北': \['陕西', '甘肃', '宁夏', '河南', '青海', '山西'],
'华中': \['浙江', '福建', '江西'],
'西南': \['云南', '贵州', '四川', '重庆', '广西', '湖北'],
'华东': \['江苏', '上海'],
'华南': \['新疆', '西藏', '海南'],
'广东': \['广东'], '湖南': \['湖南']
}
parsed\_rows = \[]
for \_, row in df\_main.iterrows():
shop\_raw = clean\_shell(row.get('店铺', ''))
combo = clean\_shell(row.get('拆自组合装', ''))
addr\_raw = clean\_shell(row.get('收货地区', ''))
order\_id = clean\_shell(row.get('子单原始单号', ''))
qty = clean\_shell(row.get('实发数量', ''))
pay\_amt = clean\_shell(row.get('单品支付金额', ''))
remark = ""
for p in \['淘宝', '天猫', '京东', '拼多多', '抖音']:
if p in shop\_raw: remark = p; break
if not remark: continue
addr\_parts = addr\_raw.split()
prov\_raw = addr\_parts\[0] if len(addr\_parts) > 0 else ""
city\_raw = addr\_parts\[1] if len(addr\_parts) > 1 else ""
prov = prov\_raw.replace("省", "").replace("自治区", "").replace("市", "")
if "内蒙古" in prov: prov = "内蒙古"
elif "广西" in prov: prov = "广西"
elif "西藏" in prov: prov = "西藏"
elif "新疆" in prov: prov = "新疆"
elif "宁夏" in prov: prov = "宁夏"
# 🚨 直辖市全切断防重复前缀拦截
if prov in \["北京", "上海", "天津", "重庆"]:
geo\_prefix = prov
else:
city = city\_raw
# 🚨 完美修复:调整执行顺序,必须先判定长词“地区”,再判定单字“市县区盟”!
if city.endswith('地区'):
city = city\[:-2]
elif city.endswith(('市', '县', '区', '盟')):
city = city\[:-1]
city = re.sub(r'(哈尼族|土家族|苗族|彝族|藏族|回族|白族|傣族|景泼族|壮族|侗族|布依族|满族|蒙古族|哈萨克族|柯尔克孜族|羌族|各族|自治州|自治县)+', '', city)
if city.endswith('州') and len(city) > 2: city = city\[:-1]
geo\_prefix = f"{prov}{city}" if city else prov
region = ""
for reg, keywords in region\_rules.items():
if any(k in prov for k in keywords): region = reg; break
if remark == "京东":
person\_name = vlookup\_dict.get(order\_id, "")
customer = f"{geo\_prefix}{person\_name}"
else:
customer = f"{geo\_prefix}{order\_id}"
# 品种逻辑判定
if any(x in combo for x in \["盐焗粉", "浓缩鲜香粉", "烧腊香味素", "去腥增香调味油", "凉拌菜调味油"]): breed = "调味品"
elif "I+G" in combo: breed = "I+G"
elif "呋喃酮" in combo: breed = "呋喃酮"
elif "甲基环戊烯醇酮" in combo or "MCP" in combo: breed = "MCP"
elif "甲基" in combo: breed = "甲基"
elif "香兰素" in combo: breed = "香兰素"
else: breed = "乙基"
# 规格逻辑判定
if "30g\*3盐焗粉" in combo: spec = "12KG箱"
elif any(x in combo for x in \["1kg凉拌菜调味油", "1kg去腥增香调味油"]): spec = "1kg/瓶"
elif any(x in combo for x in \["500g凉拌菜调味油", "500g去腥增香调味油"]): spec = "0.5kg/瓶"
elif "250克" in combo: spec = "5KG箱"
elif any(x in combo for x in \["焦桶", "纯桶", "特醇桶", "焦糖桶"]): spec = "10大小"
else:
spec\_box = \["焦","纯","特醇","I+G","香兰素","呋喃酮","甲基","焦箱","纯箱","特醇箱","甲基环戊烯醇酮","香虎粉","焦糖香瓶"]
spec = "箱" if any(x in combo for x in spec\_box) else "箱"
try: unit\_price = float(pay\_amt) / float(qty) if float(qty) != 0 else ""
except: unit\_price = ""
company = "肇庆香料" if remark == "淘宝" else "华宝星湖"
parsed\_rows.append({
'日期': ' ', '省份': prov, '区域': region, '客户': customer, '餐饮配料': '餐饮配料', '直销': '直销',
'品种': breed, '规格': spec, 'combo\_raw': combo, '数量': qty, '含税单价': unit\_price,
'备注': remark, '订单号': order\_id, '所属公司': company
})
res\_df = pd.DataFrame(parsed\_rows)
if res\_df.empty: return
# 倒数第二步:批量剥离香型关键字
removals = \["250克", "箱", "瓶", "桶", "500g", "1kg", "30g\*3"]
def clean\_flavor(text):
for word in removals: text = text.replace(word, "")
return text
res\_df\['香型'] = res\_df\['combo\_raw'].apply(clean\_flavor)
res\_df = res\_df.drop(columns=\['combo\_raw'])
# 店铺与多级置顶排序权重计算
shop\_weight = {'淘宝': 1, '天猫': 2, '京东': 3, '拼多多': 4, '抖音': 5}
res\_df\['shop\_w'] = res\_df\['备注'].map(shop\_weight)
res\_df\['breed\_w'] = res\_df\['品种'].apply(lambda b: 0 if b == '调味品' else (1 if b != '乙基' else 2))
res\_df = res\_df.sort\_values(by=\['shop\_w', 'breed\_w']).reset\_index(drop=True)
res\_df = res\_df.drop(columns=\['shop\_w', 'breed\_w'])
# 生成 Excel 并渲染
wb = Workbook()
ws = wb.active
ws.title = "重构账目表"
headers = \["日期", "省份", "区域", "客户", "餐饮配料", "直销", "品种", "规格", "香型", "数量", "含税单价", "含税金额", "不含税金额", "备注", "订单号", "所属公司"]
ws.append(headers)
fills = {
'天猫': PatternFill(start\_color="FFFF00", end\_color="FFFF00", fill\_type="solid"),
'京东': PatternFill(start\_color="92D050", end\_color="92D050", fill\_type="solid"),
'拼多多': PatternFill(start\_color="00B0F0", end\_color="00B0F0", fill\_type="solid"),
'抖音': PatternFill(start\_color="ED7D31", end\_color="ED7D31", fill\_type="solid")
}
center\_align = Alignment(horizontal="center", vertical="center")
font\_12 = Font(size=12)
font\_11 = Font(size=11)
for col\_idx in range(1, 17):
ws.cell(row=1, column=col\_idx).font = font\_12
ws.cell(row=1, column=col\_idx).alignment = center\_align
for i, r in res\_df.iterrows():
row\_idx = i + 2
row\_data = \[
r\['日期'], r\['省份'], r\['区域'], r\['客户'], r\['餐饮配料'], r\['直销'],
r\['品种'], r\['规格'], r\['香型'], r\['数量'], r\['含税单价'],
f"=J{row\_idx}\*K{row\_idx}", f"=L{row\_idx}/1.13",
r\['备注'], r\['订单号'], r\['所属公司']
]
ws.append(row\_data)
row\_fill = fills.get(r\['备注'], None)
# 🚨 对整行(包含空白列)强制进行颜色填充与字号分级
for col\_idx in range(1, 17):
cell = ws.cell(row=row\_idx, column=col\_idx)
cell.alignment = center\_align
if row\_fill: cell.fill = row\_fill
cell.font = font\_11 if col\_idx in \[5, 6, 12, 13] else font\_12
if col\_idx == 13: cell.number\_format = '0.00'
wb.save(output\_path)
共 10 个版本