

2017-04-27


  Slicing Sequence...
  Training MGS Random Forests...
  Slicing Sequence...
  Training MGS Random Forests...
  Adding/Training Layer, n_layer=1
  Layer validation accuracy = 0.5964125560538116
  Adding/Training Layer, n_layer=2
  Layer validation accuracy = 0.5695067264573991

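With the parameters changed to shape_1X=[1,13], window=[1,6], accuracy on the training data reaches only about 0.59, which is not satisfactory. This is only a starting point to invite discussion; proper parameter tuning needs guidance from more experienced practitioners.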
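To make the effect of the `window` setting more concrete, here is a minimal sketch of how a sliding window cuts a 13-feature sequence into slices. It is an illustration only: the helper `sliding_windows` and the sample data are hypothetical, and a stride of 1 is assumed, which may differ from what gcForest does internally.

```python
def sliding_windows(sequence, window, stride=1):
    """Return every contiguous slice of length `window`, moving `stride` steps at a time."""
    if window > len(sequence):
        raise ValueError("window is longer than the sequence")
    last_start = len(sequence) - window
    return [sequence[start:start + window] for start in range(0, last_start + 1, stride)]


# One hypothetical sample with 13 features, matching shape_1X=[1, 13].
sample = list(range(13))

for window in (1, 6):
    slices = sliding_windows(sample, window)
    print(f"window={window}: {len(slices)} slices of length {window}")
```

Under these assumptions a window of 1 yields 13 single-feature slices and a window of 6 yields 8 slices, so a very small window carries almost no context from neighbouring features.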

Now checking the prediction for the test set:


  pred_X = gcf.predict(X_te)
  print(len(pred_X))
  print(len(y_te))
  print(pred_X)

  Slicing Sequence...
  Slicing Sequence...
  549
  549
  [1 1 0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 1 0 1 0 1 0 1 0 0 0 0 0 0 1 1 1 0 0 1 0 ...

  # most recent predictions

  for i in range(1, len(pred_X)):
      print(y_te[-i], pred_X[-i], -i)

  0 1 -1
  0 0 -2
  1 0 -3
  1 0 -4
  0 1 -5

  ...

  # store the outcome of each day's prediction: 1 if the prediction was correct, 0 if it was wrong
  result_list = []

  # check whether prediction i was correct
  def checkPredict(i):
      if pred_X[i] == y_te[i]:
          result_list.append(1)
      else:
          result_list.append(0)


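  k = 0
  j = len(y_te)  # assumed: the original line is cut off after "j"; the output below reports 549 samples

  # check the k-th block of j samples, counted back from the end of the test set
  for i in range(len(y_te) - j * (k + 1), len(y_te) - j * k):
      checkPredict(i)

  # print(y_pred[i])
  # return result_list

  print(len(y_te))
  print(len(result_list))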

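  import matplotlib.pyplot as plt

  x = range(0, len(result_list))
  y = []

  for i in range(0, len(result_list)):
      # y.append((1 + float(sum(result_list[:i])) / (i + 1)) / 2)
      # note: result_list[:i] leaves out element i, so each value lags one sample behind;
      # use result_list[:i + 1] for the exact running accuracy
      y.append(float(sum(result_list[:i])) / (i + 1))

  print('Accuracy over the last', j, 'predictions:', y[-1])
  print(x, y)
  line, = plt.plot(x, y)
  plt.show()

  549
  549
  Accuracy over the last 549 predictions: 0.5300546448087432
  range(0, 549) [0.0, 0.0, 0.3333333333333333, 0.25 ...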
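The loop above divides `sum(result_list[:i])` by `i + 1`, so the i-th value does not yet count sample i. As a possible alternative, the sketch below computes the running accuracy in a single pass and includes sample i in the i-th value. The helper name `running_accuracy` and the hit list are hypothetical and are not taken from the run above.

```python
from itertools import accumulate


def running_accuracy(hits):
    """hits[i] is 1 for a correct prediction and 0 for a wrong one."""
    # accumulate() yields the number of correct predictions seen so far.
    return [total / (n + 1) for n, total in enumerate(accumulate(hits))]


# Hypothetical hit list for illustration.
hits = [0, 1, 0, 1, 1]
print(running_accuracy(hits))
```

The last element of the returned list is the overall accuracy, which is the same quantity that `accuracy_score` reports for the whole test set.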



  # evaluating accuracy

  accuracy = accuracy_score(y_true=y_te, y_pred=pred_X)
  print('gcForest accuracy : {}'.format(accuracy))

  gcForest accuracy : 0.5300546448087432




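  # loading the data

  from sklearn.datasets import load_digits
  from sklearn.model_selection import train_test_split
  from sklearn.metrics import accuracy_score

  digits = load_digits()
  X = digits.data
  y = digits.target
  X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.4)

  # note: the digits images are 8x8 (64 features); shape_1X=[7, 8] is the value used for the run logged below
  gcf = gcForest(shape_1X=[7, 8], window=[4, 6], tolerance=0.0, min_samples_mgs=10, min_samples_cascade=7)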

  #gcf = gcForest(shape_1X=13, window=13, tolerance=0.0, min_samples_mgs=10, min_samples_cascade=7)

  gcf.fit(X_tr, y_tr)

  Slicing Images...
  Training MGS Random Forests...
  Slicing Images...
  Training MGS Random Forests...
  Adding/Training Layer, n_layer=1
  Layer validation accuracy = 0.9814814814814815
  Adding/Training Layer, n_layer=2
  Layer validation accuracy = 0.9814814814814815

  # predict on the digits test set first; otherwise pred_X still holds the earlier predictions
  pred_X = gcf.predict(X_te)

  # evaluating accuracy
  accuracy = accuracy_score(y_true=y_te, y_pred=pred_X)
  print('gcForest accuracy : {}'.format(accuracy))

  gcForest accuracy : 0.980528511821975





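  gcf = gcForest(shape_1X=[8, 8], window=5, min_samples_mgs=10, min_samples_cascade=7)
  X_tr_mgs = gcf.mg_scanning(X_tr, y_tr)

  Slicing Images...
  Training MGS Random Forests...

  # the test set has to be transformed the same way; X_te_mgs is used further down
  X_te_mgs = gcf.mg_scanning(X_te)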

It is now possible to use the mg_scanning output as input for cascade forests with different parameters. Note that the cascade forest module does not return class predictions directly; it returns the class probabilities predicted by each Random Forest in the last layer of the cascade. The final prediction is therefore obtained by averaging these probabilities over the forests and then taking the class with the highest mean.

  gcf = gcForest(tolerance=0.0, min_samples_mgs=10, min_samples_cascade=7)
  _ = gcf.cascade_forest(X_tr_mgs, y_tr)

  Adding/Training Layer, n_layer=1
  Layer validation accuracy = 0.9722222222222222
  Adding/Training Layer, n_layer=2
  Layer validation accuracy = 0.9907407407407407
  Adding/Training Layer, n_layer=3
  Layer validation accuracy = 0.9814814814814815
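The log above suggests that layers keep being added until the validation accuracy stops improving by more than `tolerance`. The sketch below shows one stopping rule that is consistent with the logs in this article. It is inferred from the output rather than taken from the gcForest source, and the function name `layers_trained` is hypothetical.

```python
def layers_trained(validation_accuracies, tolerance=0.0):
    """Count how many layers get trained before the improvement drops to `tolerance` or below."""
    trained = 1  # the first layer is always trained
    for previous, current in zip(validation_accuracies, validation_accuracies[1:]):
        trained += 1
        if current - previous <= tolerance:
            break  # this layer did not help enough, so stop growing the cascade
    return trained


# Validation accuracies copied from the log above.
log = [0.9722222222222222, 0.9907407407407407, 0.9814814814814815]
print(layers_trained(log))
```

With this rule, training stops at the third layer because its accuracy falls below that of the second, which matches the three layers reported in the log.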

  import numpy as np

  # average the per-forest probabilities, then pick the most probable class
  pred_proba = gcf.cascade_forest(X_te_mgs)
  tmp = np.mean(pred_proba, axis=0)
  preds = np.argmax(tmp, axis=1)
  accuracy_score(y_true=y_te, y_pred=preds)

  0.97774687065368571

  # same mg_scanning output, different cascade parameters
  gcf = gcForest(tolerance=0.0, min_samples_mgs=20, min_samples_cascade=10)
  _ = gcf.cascade_forest(X_tr_mgs, y_tr)
  pred_proba = gcf.cascade_forest(X_te_mgs)
  tmp = np.mean(pred_proba, axis=0)
  preds = np.argmax(tmp, axis=1)
  accuracy_score(y_true=y_te, y_pred=preds)

  Adding/Training Layer, n_layer=1
  Layer validation accuracy = 0.9629629629629629
  Adding/Training Layer, n_layer=2
  Layer validation accuracy = 0.9675925925925926
  Adding/Training Layer, n_layer=3
  Layer validation accuracy = 0.9722222222222222
  Adding/Training Layer, n_layer=4
  Layer validation accuracy = 0.9722222222222222
  0.97218358831710705
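To show what the `np.mean(..., axis=0)` and `np.argmax(..., axis=1)` steps do, here is the same computation spelled out without NumPy. It is a sketch on made-up probabilities; the helper `average_then_argmax` is hypothetical, and the input layout (forest, then sample, then class) is assumed from the axes used above.

```python
def average_then_argmax(per_forest_proba):
    """per_forest_proba[f][s][c]: probability from forest f, for sample s, of class c."""
    n_forests = len(per_forest_proba)
    n_samples = len(per_forest_proba[0])
    n_classes = len(per_forest_proba[0][0])
    predictions = []
    for s in range(n_samples):
        # Average over forests: the equivalent of np.mean(..., axis=0).
        mean_proba = [
            sum(forest[s][c] for forest in per_forest_proba) / n_forests
            for c in range(n_classes)
        ]
        # Index of the largest averaged probability: np.argmax(..., axis=1).
        predictions.append(max(range(n_classes), key=lambda c: mean_proba[c]))
    return predictions


# Made-up probabilities: 2 forests, 2 samples, 3 classes.
proba = [
    [[0.6, 0.3, 0.1], [0.2, 0.3, 0.5]],
    [[0.2, 0.7, 0.1], [0.1, 0.2, 0.7]],
]
print(average_then_argmax(proba))
```

For the first sample the two forests disagree (class 0 versus class 1), and the averaged probabilities settle on class 1. This is why the mean is taken before the argmax instead of voting on each forest's own argmax.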

Skipping mg_scanning

It is also possible to use the cascade forest directly and skip the multi-grain scanning step.

  gcf = gcForest(tolerance=0.0, min_samples_cascade=20)
  _ = gcf.cascade_forest(X_tr, y_tr)


