Training DeepFM under PAI-Notebook

DeepFM is one of the most widely used CTR prediction models today. For a recommendation system based on CTR estimation, the key is to learn the feature combinations behind users' click behavior. Depending on the scenario, both low-order and high-order feature combinations can affect the final CTR. Wide models (LR/FM/FFM) generally learn only first- and second-order feature combinations, while deep models (FNN/PNN) generally learn high-order feature combinations. DeepFM takes both into account. First, let's review the model structure of DeepFM:

As shown in the figure, DeepFM consists of two parts: a Factorization Machine (FM) part and a deep neural network (DNN) part, responsible for extracting low-order and high-order features respectively. The two parts share the same embedding-layer input. The DeepFM prediction can be written as $\hat{y} = \mathrm{sigmoid}(y_{FM} + y_{DNN})$, where $y_{FM}$ and $y_{DNN}$ are the outputs of the FM part and the deep part.

Dataset

Let's start with the following dataset as an example (the column names are those of Kaggle's Porto Seguro safe-driver data):

import pandas as pd

TRAIN_FILE = "data/train.csv"
TEST_FILE = "data/test.csv"

# Continuous features
NUMERIC_COLS = [
    "ps_reg_01", "ps_reg_02", "ps_reg_03",
    "ps_car_12", "ps_car_13", "ps_car_14", "ps_car_15"
]

# Columns excluded from the feature map
IGNORE_COLS = [
    "id", "target",
    "ps_calc_01", "ps_calc_02", "ps_calc_03", "ps_calc_04",
    "ps_calc_05", "ps_calc_06", "ps_calc_07", "ps_calc_08",
    "ps_calc_09", "ps_calc_10", "ps_calc_11", "ps_calc_12",
    "ps_calc_13", "ps_calc_14",
    "ps_calc_15_bin", "ps_calc_16_bin", "ps_calc_17_bin",
    "ps_calc_18_bin", "ps_calc_19_bin", "ps_calc_20_bin"
]

dfTrain = pd.read_csv(TRAIN_FILE)
dfTest = pd.read_csv(TEST_FILE)

A preview of the dataset appears here in the original post (figure omitted).

Next, we build the feature map. The feature map defines how to convert each variable's value into its corresponding feature index.

# Build the feature map over train and test combined,
# so every value seen in either set gets an index
df = pd.concat([dfTrain, dfTest], axis=0, sort=False)

feature_dict = {}
total_feature = 0
for col in df.columns:
    if col in IGNORE_COLS:
        continue
    elif col in NUMERIC_COLS:
        # A numeric column occupies a single feature index
        feature_dict[col] = total_feature
        total_feature += 1
    else:
        # A categorical column gets one index per distinct value
        unique_val = df[col].unique()
        feature_dict[col] = dict(zip(unique_val, range(total_feature, len(unique_val) + total_feature)))
        total_feature += len(unique_val)
print(total_feature)
print(feature_dict)

feature_dict now holds the feature-index mapping: each numeric column maps to a single index, while each categorical column maps every distinct value to its own index. (The original post illustrates this with a figure.)
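
Since the figure is not reproduced, here is roughly what the mapping looks like; the indices shown in the comments are illustrative, not actual output:

print(feature_dict['ps_reg_01'])      # numeric column -> a single index, e.g. 0
print(feature_dict['ps_ind_02_cat'])  # categorical -> value-to-index dict, e.g. {1: 7, 2: 8, 4: 9, 3: 10, -1: 11}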

FM implementation

Next, convert the training set into two new tables, mapping each row to its corresponding feature indices and feature values:

print(dfTrain.columns)
train_y = dfTrain[['target']].values.tolist()
dfTrain.drop(['target','id'], axis=1, inplace=True)
train_feature_index = dfTrain.copy()
train_feature_value = dfTrain.copy()

for col in train_feature_index.columns:
    if col in IGNORE_COLS:
        train_feature_index.drop(col, axis=1, inplace=True)
        train_feature_value.drop(col, axis=1, inplace=True)
        continue
    elif col in NUMERIC_COLS:
        # Numeric feature: fixed index, keep the raw value
        train_feature_index[col] = feature_dict[col]
    else:
        # Categorical feature: look up the value's index, set the value to 1
        train_feature_index[col] = train_feature_index[col].map(feature_dict[col])
        train_feature_value[col] = 1
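
To sanity-check the conversion, you can inspect how the first row was split into indices and values (the output naturally depends on your data):

print(train_feature_index.iloc[0].head())
print(train_feature_value.iloc[0].head())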

Next, define the model's hyperparameters (learning rate, embedding size, deep-network layout, activation function, etc.). The model takes three inputs: the feature indices and feature values we just built, plus the label:

import tensorflow as tf
import numpy as np
"""model parameter"""
dfm_params = {
    "use_fm":True,
    "use_deep":True,
    "embedding_size":8,
    "dropout_fm":[1.0,1.0],
    "deep_layers":[32,32],
    "dropout_deep":[0.5,0.5,0.5],
    "deep_layer_activation":tf.nn.relu,
    "epoch":30,
    "batch_size":1024,
    "learning_rate":0.001,
    "optimizer":"adam",
    "batch_norm":1,
    "batch_norm_decay":0.995,
    "l2_reg":0.01,
    "verbose":True,
    "eval_metric":'gini_norm',
    "random_seed":3
}
dfm_params['feature_size'] = total_feature
dfm_params['field_size'] = len(train_feature_index.columns)

feat_index = tf.placeholder(tf.int32,shape=[None,None],name='feat_index')
feat_value = tf.placeholder(tf.float32,shape=[None,None],name='feat_value')

label = tf.placeholder(tf.float32,shape=[None,1],name='label')
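
The original post never shows the block that creates the weights dictionary used below. Here is a minimal sketch consistent with the names and shapes the rest of the code expects; the initialization scheme (small random normals and Glorot-style scaling) is our assumption:

"""weights"""
weights = dict()

# Embeddings for the FM second-order term and the DNN input
weights['feature_embeddings'] = tf.Variable(
    tf.random_normal([dfm_params['feature_size'], dfm_params['embedding_size']], 0.0, 0.01),
    name='feature_embeddings')
# Per-feature weights for the FM first-order term
weights['feature_bias'] = tf.Variable(
    tf.random_normal([dfm_params['feature_size'], 1], 0.0, 1.0), name='feature_bias')

# Deep layers
num_layer = len(dfm_params['deep_layers'])
input_size = dfm_params['field_size'] * dfm_params['embedding_size']
glorot = np.sqrt(2.0 / (input_size + dfm_params['deep_layers'][0]))
weights['layer_0'] = tf.Variable(
    np.random.normal(loc=0, scale=glorot, size=(input_size, dfm_params['deep_layers'][0])), dtype=np.float32)
weights['bias_0'] = tf.Variable(
    np.random.normal(loc=0, scale=glorot, size=(1, dfm_params['deep_layers'][0])), dtype=np.float32)
for i in range(1, num_layer):
    glorot = np.sqrt(2.0 / (dfm_params['deep_layers'][i-1] + dfm_params['deep_layers'][i]))
    weights['layer_%d' % i] = tf.Variable(
        np.random.normal(loc=0, scale=glorot, size=(dfm_params['deep_layers'][i-1], dfm_params['deep_layers'][i])), dtype=np.float32)
    weights['bias_%d' % i] = tf.Variable(
        np.random.normal(loc=0, scale=glorot, size=(1, dfm_params['deep_layers'][i])), dtype=np.float32)

# Final projection: FM first-order (field_size) + FM second-order (embedding_size) + deep output
input_size = dfm_params['field_size'] + dfm_params['embedding_size'] + dfm_params['deep_layers'][-1]
glorot = np.sqrt(2.0 / (input_size + 1))
weights['concat_projection'] = tf.Variable(
    np.random.normal(loc=0, scale=glorot, size=(input_size, 1)), dtype=np.float32)
weights['concat_bias'] = tf.Variable(tf.constant(0.01), name='concat_bias')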

After defining the inputs, we can build the FM part. As shown below, we first look up each feature's embedding and multiply it by the feature value; these embeddings also serve as the weights of the FM second-order term. Then we carry out the FM computation:

"""embedding"""
embeddings = tf.nn.embedding_lookup(weights['feature_embeddings'],feat_index)

reshaped_feat_value = tf.reshape(feat_value,shape=[-1,dfm_params['field_size'],1])

embeddings = tf.multiply(embeddings,reshaped_feat_value)


"""fm part"""
fm_first_order = tf.nn.embedding_lookup(weights['feature_bias'],feat_index)
fm_first_order = tf.reduce_sum(tf.multiply(fm_first_order,reshaped_feat_value),2)

summed_features_emb = tf.reduce_sum(embeddings,1)
summed_features_emb_square = tf.square(summed_features_emb)

squared_features_emb = tf.square(embeddings)
squared_sum_features_emb = tf.reduce_sum(squared_features_emb,1)

fm_second_order = 0.5 * tf.subtract(summed_features_emb_square,squared_sum_features_emb)
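
The second-order code above relies on the standard FM rewrite, which reduces the pairwise-interaction sum to time linear in the number of features:

$$\sum_{i=1}^{n}\sum_{j=i+1}^{n}\langle v_i, v_j\rangle\, x_i x_j = \frac{1}{2}\sum_{f=1}^{k}\left[\Big(\sum_{i=1}^{n} v_{i,f}\, x_i\Big)^{2} - \sum_{i=1}^{n} v_{i,f}^{2}\, x_i^{2}\right]$$

The code computes the bracketed difference per embedding dimension $f$ and leaves the sum over $f$ to the final projection layer.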

Deep

The Deep part is simple: a few fully connected layers on top of the flattened embeddings:

# Flatten the field embeddings into a single input vector
y_deep = tf.reshape(embeddings, shape=[-1, dfm_params['field_size'] * dfm_params['embedding_size']])

for i in range(0, len(dfm_params['deep_layers'])):
    y_deep = tf.add(tf.matmul(y_deep, weights["layer_%d" % i]), weights["bias_%d" % i])
    # relu is hardcoded here; dfm_params['deep_layer_activation'] could be used instead
    y_deep = tf.nn.relu(y_deep)

For the final output, the paper combines the two parts as $\hat{y} = \mathrm{sigmoid}(y_{FM} + y_{DNN})$; the code realizes this by concatenating the FM and deep outputs and projecting them to a single logit:

"""final layer""" 
if dfm_params['use_fm'] and dfm_params['use_deep']: 
    concat_input = tf.concat([fm_first_order,fm_second_order,y_deep],axis=1)  
elif dfm_params['use_fm']:  
    concat_input = tf.concat([fm_first_order,fm_second_order],axis=1) 
elif dfm_params['use_deep']:  
    concat_input = y_deep 
  
out = tf.nn.sigmoid(tf.add(tf.matmul(concat_input,weights['concat_projection']),weights['concat_bias']))
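
The training loop below references loss and optimizer, which the post also leaves undefined. A minimal sketch, assuming plain log loss and the Adam optimizer configured in dfm_params:

"""loss and optimizer"""
loss = tf.losses.log_loss(label, out)
optimizer = tf.train.AdamOptimizer(learning_rate=dfm_params['learning_rate']).minimize(loss)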
    

Now that the whole DeepFM graph is assembled, we can train the model and check its loss:


"""train""" 
with tf.Session() as sess:  
    sess.run(tf.global_variables_initializer()) 
    for i in range(100):  
        epoch_loss,_ = sess.run([loss,optimizer],feed_dict={feat_index:train_feature_index, 
                             feat_value:train_feature_value,  
                             label:train_y})  
        print("epoch %s,loss is %s" % (str(i),str(epoch_loss))
    

The above is the complete DeepFM implementation, with the whole process run in PAI-Notebook.

PAI-Studio now also supports the FM operation (see https://yq.aliyun.com/articles/742753?spm=a2c4e.11157919.spm-cont-list.8.146cf2042NrSK5 for details); by splicing a TensorFlow DNN onto it, a DeepFM-like effect can be achieved.

We also hope that PAI-Studio will support module customization more flexibly: currently only PySpark, Spark, and SQL are supported, and uploading custom Python code is not.
