
The C4.5 Decision Tree Algorithm

Source: user-shared  Time: 2025/8/16 10:53:24

Data Warehouse and Data Mining

Course: "Data Warehouse and Data Mining"

The C4.5 Decision Tree Algorithm

Group members:

07103218 Wang Weiguang (王维光)  07103224 Zheng Chen (郑辰)  07103229 Liu Qian (刘倩)  07103230 Song Chen (宋琛)


I. Background

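The earliest decision tree algorithm was CLS, proposed by Hunt et al. in 1966. The most influential decision tree algorithms today are ID3, proposed by Quinlan in 1986, and C4.5, which he proposed in 1993. ID3 handles only discrete descriptive attributes. It splits the training samples on the attribute with the largest information gain, so that the entropy of the system after branching is as small as possible, which improves the speed and accuracy of the algorithm. The main defect of ID3 is that, when information gain is the criterion for choosing the branching attribute, it is biased toward attributes with many values, and in some cases such attributes provide little useful information. C4.5 is an improvement on ID3 that handles continuous as well as discrete descriptive attributes. It uses the information gain ratio as the criterion for choosing the branching attribute, which makes up for this shortcoming of ID3.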

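Decision tree algorithms have the following advantages: (1) high classification accuracy; (2) simple generated patterns; (3) good robustness to noisy data. They are therefore among the most widely applied inductive inference algorithms and have received broad attention from data mining researchers.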

II. Specific Improvements in C4.5

1. Shortcomings of the ID3 algorithm

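(1) ID3 uses information gain as the evaluation criterion when choosing the branching attribute at the root and at each internal node. Information gain tends to favor attributes with many values, and in some cases such attributes provide little useful information.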

(2) ID3 can build a decision tree only for datasets whose descriptive attributes are all discrete.

2. Improvements made by C4.5

(1) Selecting attributes by information gain ratio

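This overcomes the bias toward many-valued attributes that arises when attributes are selected by information gain. The information gain ratio is defined as:

GainRatio(S,A) = Gain(S,A) / SplitInfo(S,A)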

Here Gain(S,A) is the same information gain used in ID3, while the split information SplitInfo(S,A) represents the breadth and uniformity with which attribute A splits the sample set S.
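As a rough illustration of why dividing by the split information counters the many-values bias, the sketch below computes both quantities from subset sizes. It assumes the usual base-2 entropy form of split information; the helper names `splitInfo` and `gainRatio`, and the convention of returning 0 for a trivial split, are hypothetical and are not taken from the report's own source code.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Split information of a partition, given the size of each subset S_i.
// Assumed form: -sum over i of (|S_i|/|S|) * log2(|S_i|/|S|).
double splitInfo(const std::vector<int>& subsetSizes) {
    int total = 0;
    for (int n : subsetSizes) total += n;
    assert(total > 0);

    double result = 0.0;
    for (int n : subsetSizes) {
        if (n == 0) continue;  // an empty subset contributes nothing
        double p = static_cast<double>(n) / total;
        result -= p * std::log2(p);
    }
    return result;
}

// Gain ratio = Gain(S, A) / SplitInfo(S, A).
// Returning 0 when SplitInfo is 0 (a single subset) is an assumption of
// this sketch, made only to avoid dividing by zero.
double gainRatio(double gain, const std::vector<int>& subsetSizes) {
    double si = splitInfo(subsetSizes);
    return si > 0.0 ? gain / si : 0.0;
}
```

With the same gain of 0.5, an even two-way split of 30 cases has split information 1 and therefore gain ratio 0.5, whereas a split into 30 single-case subsets has split information log2(30), about 4.91, and a gain ratio of only about 0.10.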


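The split information is defined as:

SplitInfo(S,A) = -SUM(i=1..c) (|Si|/|S|) * log2(|Si|/|S|)

where S1 through Sc are the c sample subsets formed by partitioning S on the c distinct values of attribute A. For example, if attribute A divides a set S of 30 cases into two subsets of 10 and 20 cases, then SplitInfo(S,A) = -1/3*log2(1/3) - 2/3*log2(2/3), which is about 0.918.

(2) Handling continuous numeric attributes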

C4.5 can handle both discrete and continuous descriptive attributes. When choosing the branching attribute at a node, C4.5 treats a discrete attribute exactly as ID3 does, computing over the attribute's own set of values. For a continuous descriptive attribute Ac, suppose the dataset at the node contains total samples; C4.5 then proceeds as follows.

• Sort all data samples at the node in ascending order of the continuous attribute's value, giving the value sequence {A1c, A2c, ..., Atotalc}.

• Generate total-1 split points in the value sequence. The value of the i-th split point (0 < i < total) is set to Vi = (Aic + A(i+1)c)/2, which divides the dataset at the node into two subsets.

• Choose the best of the total-1 split points. For the partition induced by each split point, C4.5 computes the information gain ratio and splits the dataset at the point whose gain ratio is largest.

(3) A post-pruning method

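Post-pruning keeps the tree from growing without restraint and avoids overfitting the data. The method uses the training sample set itself to estimate the error before and after pruning, and on that basis decides whether to actually prune. The formula used is:

Pr[ (f - q) / sqrt(q(1 - q)/N) > z ] = c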

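where N is the number of instances, f = E/N is the observed error rate (E being the number of misclassified instances among the N), q is the true error rate, c is the confidence level (an input parameter of C4.5, default 0.25), and z is the number of standard deviations corresponding to confidence c, obtained from a normal distribution table for the chosen c. From this formula one can compute an upper confidence limit on the true error rate q, and this limit serves as a pessimistic estimate e of the node's error rate:

e = ( f + z^2/(2N) + z*sqrt( f/N - f^2/N + z^2/(4N^2) ) ) / ( 1 + z^2/N )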
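To make the estimate concrete, here is a minimal sketch that evaluates the upper limit e from f, N and z. The function name `pessimisticError` is hypothetical, and z is passed in by the caller rather than looked up from c; a value of about 0.69 is often quoted for c = 0.25, but that figure is an assumption here and is not stated in the report.

```cpp
#include <cassert>
#include <cmath>

// Upper confidence limit on the true error rate q, used as the
// pessimistic estimate e for a node.
//   f : observed error rate E/N at the node
//   N : number of instances at the node
//   z : normal deviate for the chosen confidence c (supplied by the caller)
double pessimisticError(double f, double N, double z) {
    assert(N > 0.0 && f >= 0.0 && f <= 1.0);
    double z2 = z * z;
    double numerator = f + z2 / (2.0 * N)
                     + z * std::sqrt(f / N - f * f / N + z2 / (4.0 * N * N));
    return numerator / (1.0 + z2 / N);
}
```

For a node with 2 errors in 6 instances and z = 0.69, the function returns roughly 0.47, noticeably above the observed rate of 0.33. With many more instances at the same observed rate, the estimate moves much closer to f.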


Whether to prune is decided by comparing the value of e before and after pruning.

(4) Handling missing values

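In some cases the available data may lack values for some attributes. Suppose <x, c(x)> is a training instance in sample set S whose value A(x) for attribute A is unknown. One strategy is to assign it the most common value of that attribute among the training instances at node n. A more sophisticated strategy is to assign a probability to each possible value of A. For example, given a Boolean attribute A, if node n contains 6 known instances with A=1 and 4 with A=0, then the probability that A(x)=1 is 0.6 and the probability that A(x)=0 is 0.4. Thus 60% of instance x is distributed down the branch for A=1 and 40% down the other branch. These fractional examples are used to compute information gain and, if a second attribute with a missing value must be tested, they can be subdivided further at later branches of the tree. C4.5 uses this method to handle missing attribute values.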
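The fractional-example idea can be sketched as a weight attached to each case. The code below is only an illustration under stated assumptions: the type `WeightedCase`, the marker `kMissing` and the function `branchWeights` are hypothetical names, and attribute values are represented as plain integers.

```cpp
#include <cassert>
#include <map>
#include <vector>

const int kMissing = -1;  // hypothetical marker for an unknown attribute value

// One (possibly fractional) training case reaching a node.
struct WeightedCase {
    int attributeValue;  // value of attribute A, or kMissing if unknown
    double weight;       // 1.0 for a whole case, less for a fragment
};

// Total case weight sent down each branch of attribute A.
// Cases with a known value go whole to their branch; a case with a missing
// value is split across all branches in proportion to the known weights.
std::map<int, double> branchWeights(const std::vector<WeightedCase>& cases) {
    std::map<int, double> known;
    double knownTotal = 0.0;
    double missingTotal = 0.0;
    for (const WeightedCase& c : cases) {
        if (c.attributeValue == kMissing) {
            missingTotal += c.weight;
        } else {
            known[c.attributeValue] += c.weight;
            knownTotal += c.weight;
        }
    }
    assert(knownTotal > 0.0);  // proportions need at least one known value

    std::map<int, double> result;
    for (const auto& kv : known) {
        double p = kv.second / knownTotal;  // e.g. 0.6 for A=1, 0.4 for A=0
        result[kv.first] = kv.second + p * missingTotal;
    }
    return result;
}
```

With six cases having A=1, four having A=0 and one case whose value is missing, the A=1 branch receives a total weight of 6.6 and the A=0 branch 4.4, matching the 60%/40% split described in the text.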

3. Advantages and disadvantages of C4.5

Advantages: the classification rules produced are easy to understand, and accuracy is relatively high.

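Disadvantages: building the tree requires repeated sequential scans and sorts of the dataset, which makes the algorithm inefficient. In addition, C4.5 is suitable only for datasets that fit in memory; when the training set is too large to hold in memory, the program cannot run.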
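Much of the sorting and scanning cost mentioned here comes from the continuous-attribute procedure in part 2(2), which has to be repeated at every node. The sketch below follows that procedure as this report describes it (sort, form midpoints, keep the midpoint with the largest gain ratio). The names `entropyOf` and `bestThreshold` are hypothetical, class labels are assumed to be integers, and actual C4.5 releases may differ in details of how the threshold is chosen.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <map>
#include <utility>
#include <vector>

// Entropy (base 2) of a class-count table that holds `total` cases.
double entropyOf(const std::map<int, int>& counts, int total) {
    double h = 0.0;
    for (const auto& kv : counts) {
        if (kv.second == 0) continue;
        double p = static_cast<double>(kv.second) / total;
        h -= p * std::log2(p);
    }
    return h;
}

// Returns the midpoint threshold with the highest gain ratio for one
// continuous attribute. Each sample is a (value, classLabel) pair.
double bestThreshold(std::vector<std::pair<double, int>> samples) {
    const int total = static_cast<int>(samples.size());
    assert(total >= 2);
    std::sort(samples.begin(), samples.end());  // ascending by value: the costly step

    std::map<int, int> left, right;
    for (const auto& s : samples) ++right[s.second];
    const double baseEntropy = entropyOf(right, total);

    double bestRatio = -1.0;
    double bestCut = samples[0].first;
    for (int i = 0; i + 1 < total; ++i) {
        // Move sample i from the right subset to the left subset.
        ++left[samples[i].second];
        --right[samples[i].second];
        if (samples[i].first == samples[i + 1].first) continue;  // equal values cannot be separated

        int nLeft = i + 1;
        int nRight = total - nLeft;
        double pL = static_cast<double>(nLeft) / total;
        double pR = static_cast<double>(nRight) / total;
        double gain = baseEntropy - pL * entropyOf(left, nLeft) - pR * entropyOf(right, nRight);
        double split = -pL * std::log2(pL) - pR * std::log2(pR);  // > 0: both subsets non-empty
        double ratio = gain / split;
        if (ratio > bestRatio) {
            bestRatio = ratio;
            bestCut = (samples[i].first + samples[i + 1].first) / 2.0;  // V_i
        }
    }
    return bestCut;
}
```

For values 1, 2, 3 labeled class 0 and values 10, 11, 12 labeled class 1, the function returns 6.5, the midpoint that separates the two classes cleanly. Because the sort runs once per continuous attribute per node, the cost grows quickly on large training sets.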

III. C4.5 Source Code (C++)

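// C4.5_test.cpp : Defines the entry point for the console application.
//

// The header names of the original #include lines are missing from the source.
// The two below are the ones the visible code needs.
#include <stdio.h>   // FILE, fopen, getc, printf
#include <tchar.h>   // _tmain, _TCHAR

const int MAX = 10;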


int** iInput;
int i = 0;  // number of columns
int j = 0;  // number of rows

void build_tree(FILE *fp, int* iSamples, int* iAttribute, int ilevel);  // output the rules
int choose_attribute(int* iSamples, int* iAttribute);  // pick test_attribute by computing the information gain ratio
double info(double dTrue, double dFalse);  // compute the expected information
double entropy(double dTrue, double dFalse, double dAll);  // compute the entropy
double splitinfo(int* list, double dAll);
int check_samples(int *iSamples);  // check whether all samples belong to the same class
int check_ordinary(int *iSamples);  // find the most common class
int check_attribute_null(int *iAttribute);  // check whether the attribute list is empty
void get_attributes(int *iSamples, int *iAttributeValue, int iAttribute);

int _tmain(int argc, _TCHAR* argv[])
{
    FILE *fp;
    FILE *fp1;
    int iGet;  // int rather than char, so that the comparison with EOF is reliable
    int a = 0;
    int b = 0;  // a and b are loop variables
    int* iSamples;
    int* iAttribute;

    fp = fopen(/* file name and mode are missing from the source */);
    if (NULL == fp)
    {
        printf(/* message is missing from the source */);
        return 0;
    }

    // Count the commas on the first line to obtain the number of columns.
    iGet = getc(fp);
    while (('\n' != iGet) && (EOF != iGet))
    {
        if (',' == iGet)
        {
            i++;
        }
