Explore the integration of machine learning in portals for easier content gathering, organization, and search optimization. Dive into the importance, how machine learning works, applications, and advantages over traditional methods. Discover the features like spidering, reinforcement learning, and information extraction to enhance user experience.
INTERNET PORTALS WITH MACHINE LEARNING
PRESENTED BY: AMRIT CHOUDHARY, B.TECH CSE, 7TH SEM
WHY PORTALS?
• Portals gather content from the Web and organize it for easy access, retrieval, and search.
• E.g. www.twenty19.com
• Disadvantage: these portals are difficult and time-consuming to maintain.
• Solution: this project proposes using machine learning techniques to largely automate the creation and maintenance of portals.
MACHINE LEARNING
The study of computer algorithms that improve automatically through experience.
IMPORTANCE OF MACHINE LEARNING:
Machine learning matters for four general categories of tasks that are impossible or difficult to solve otherwise:
• Problems for which no human expert is available.
• Problems where human experts are available but cannot explain their expertise.
• Problems where the phenomena change rapidly.
• Applications that must be customized for each computer user.
HOW A MACHINE LEARNS
• ASSIGNING WEIGHTS: weights are assigned, and the results they produce are compared with previously stored results.
• DECISION TREES: the system starts from the parent node and explores the tree using breadth-first search (BFS) and depth-first search (DFS).
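The "assigning weights" idea can be illustrated with a perceptron-style update. This is one common reading of the bullet, not necessarily the method the slides have in mind. The function names, the training data, and the learning rate are all assumptions made for illustration: each prediction is compared with the stored (known) result, and the weights are nudged whenever the two disagree.

```python
def train_perceptron(samples, epochs=10, lr=1.0):
    """Adjust weights whenever the prediction disagrees with the stored label."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, target in samples:
            activation = sum(w * x for w, x in zip(weights, features)) + bias
            predicted = 1 if activation > 0 else 0
            error = target - predicted  # 0 when prediction matches the stored result
            weights = [w + lr * error * x for w, x in zip(weights, features)]
            bias += lr * error
    return weights, bias


def predict(weights, bias, features):
    """Return 1 if the weighted sum of the features is positive, else 0."""
    return 1 if sum(w * x for w, x in zip(weights, features)) + bias > 0 else 0


# Hypothetical stored results: the label is 1 only when both features are present.
SAMPLES = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
```

After training on `SAMPLES`, `predict` should reproduce all four stored labels, because this toy data set is linearly separable.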
FORMAL GRAMMARS:
• CREATION: a new rule is constructed by the system or acquired from an external entity.
• GENERALISATION: conditions are dropped or made less restrictive, so that the rule applies in a larger number of situations.
• SPECIALISATION: additional conditions are added, or existing conditions are made more restrictive, so that the rule applies only to specific situations.
APPLICATIONS:
• Optical character recognition (OCR)
• Face detection
• Spam filtering
• Medical diagnosis
• Spoken language understanding
• Fraud detection
PLAN: E-BOOK PORTAL
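The rule operations listed under formal grammars (generalisation and specialisation) can be sketched by treating a rule as a set of conditions that must all hold. This representation and the condition names are hypothetical, chosen only to make the two operations concrete.

```python
def matches(rule, situation):
    """A rule applies when every one of its conditions holds in the situation."""
    return rule <= situation


def generalise(rule, condition):
    """Drop a condition so the rule applies in more situations."""
    return rule - {condition}


def specialise(rule, condition):
    """Add a condition so the rule applies in fewer, more specific situations."""
    return rule | {condition}


# Hypothetical rule for recognising an e-book page.
RULE = frozenset({"has_pdf_link", "mentions_author"})
```

Dropping `mentions_author` lets the rule fire on pages that only have a PDF link, while adding a further condition restricts it again.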
MACHINE LEARNING FEATURES INCLUDED IN PORTALS
• CLASSIFICATION INTO TOPIC HIERARCHY: lets users efficiently organize, view, and explore large quantities of information.
• SPIDERING: the spider explores the Web efficiently by following the links that are more likely to lead to e-books.
• Each reference is broken down into appropriate fields, such as author, title, journal, and date.
• WEBWATCHER: a tour guide that highlights the hyperlinks it believes will be of interest.
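The spidering bullet above can be sketched as a best-first crawl: links are kept in a priority queue and the most promising one is followed first. The in-memory `WEB` graph, the keyword list, and the `link_score` heuristic are all assumptions for illustration; in the portal the score would come from a learned model rather than keyword counting.

```python
import heapq

# Hypothetical in-memory "web": page -> list of (linked page, anchor text).
WEB = {
    "home": [("news", "latest news"), ("library", "free e-book library")],
    "news": [("sports", "sports scores")],
    "library": [("book1", "download e-book pdf"), ("about", "about us")],
    "sports": [],
    "book1": [],
    "about": [],
}
KEYWORDS = {"e-book", "library", "pdf", "download"}


def link_score(anchor_text):
    """Count keyword hits in the anchor text (a stand-in for a learned score)."""
    return sum(1 for word in anchor_text.lower().split() if word in KEYWORDS)


def crawl(start, limit):
    """Visit up to `limit` pages, always expanding the highest-scoring link first."""
    frontier = [(0, 0, start)]  # (negative score, insertion order, page)
    seen, order, counter = {start}, [], 1
    while frontier and len(order) < limit:
        _, _, page = heapq.heappop(frontier)
        order.append(page)
        for target, anchor in WEB[page]:
            if target not in seen:
                seen.add(target)
                heapq.heappush(frontier, (-link_score(anchor), counter, target))
                counter += 1
    return order
```

With this toy graph, a crawl from `"home"` reaches the e-book page before it visits the unrelated news and sports pages.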
REINFORCEMENT LEARNING
• Learning optimal decision making from rewards or punishments.
• Goal of reinforcement learning: learn a policy, a mapping from states to actions, that maximizes the sum of rewards over time.
• Supervised learning: the learner is told the correct action for a particular state.
• ADVANTAGE OVER SUPERVISED LEARNING: the learner is not told the correct action; instead it is told how good or bad the selected action was, expressed as a scalar reward.
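A minimal sketch of learning such a policy follows. It assumes a toy chain of four pages in which only reaching the last page gives a reward, and it applies the tabular Q-learning update in deterministic sweeps over every state and action instead of sampling episodes. The environment, the discount factor `GAMMA`, and all names are hypothetical.

```python
N_STATES = 4            # pages 0..3; page 3 is the hypothetical e-book (goal)
ACTIONS = ("left", "right")
GAMMA = 0.5             # assumed discount factor for future rewards


def step(state, action):
    """Return (next_state, reward); reward 1.0 only when the goal is reached."""
    nxt = min(state + 1, N_STATES - 1) if action == "right" else max(state - 1, 0)
    reward = 1.0 if nxt == N_STATES - 1 and state != N_STATES - 1 else 0.0
    return nxt, reward


def learn_q(sweeps=50, alpha=1.0):
    """Apply the Q-learning update to every (state, action) pair repeatedly."""
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(sweeps):
        for s in range(N_STATES - 1):  # the goal state is terminal, never updated
            for a in ACTIONS:
                nxt, r = step(s, a)
                target = r + GAMMA * max(q[(nxt, b)] for b in ACTIONS)
                q[(s, a)] += alpha * (target - q[(s, a)])
    return q


def greedy_policy(q):
    """Map each non-terminal state to the action with the highest Q-value."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)}
```

The learner only ever sees scalar rewards, yet the resulting policy moves right from every page, because that maximizes the discounted sum of rewards.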
INFORMATION EXTRACTION
• Information extraction means identifying phrases of interest in textual data.
• It is a powerful way to summarize the information relevant to a user's needs.
• E.g. on-topic documents may be several hyperlinks away from the current choice point, but the text on the current page may indicate which hyperlink will lead to a reward soonest.
ADVANTAGES:
• Allows searches over specific fields.
• Effective presentation of search results (shown in bold).
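Field extraction and field-specific search can be sketched with a regular expression. This is a deliberately simple stand-in: a real extractor would use a learned model, and the assumed citation layout ("Author. Title. Journal, Year."), the sample reference, and the function names are all hypothetical.

```python
import re

# Assumed layout: "Author. Title. Journal, Year." (fields contain no inner periods)
PATTERN = re.compile(
    r"^(?P<author>[^.]+)\.\s+(?P<title>[^.]+)\.\s+"
    r"(?P<journal>[^,]+),\s+(?P<date>\d{4})\.$"
)

# Made-up reference used only to exercise the sketch.
SAMPLE = "Jane Doe. An Example Title. Journal of Examples, 1999."


def extract_fields(reference):
    """Split a reference into author/title/journal/date, or return None."""
    match = PATTERN.match(reference.strip())
    return match.groupdict() if match else None


def search_by_field(records, field, term):
    """Return the records whose given field contains the term (case-insensitive)."""
    return [r for r in records if term.lower() in r[field].lower()]
```

Once references are split into fields, a query such as "author contains doe" no longer matches records that merely mention the term in a title.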
CONCLUSION:
• In addition to the future work discussed earlier, there are many other areas where machine learning can further automate the construction and maintenance of domain-specific search engines. E.g. text classification can decide which documents on the Web are relevant to the domain.
• This paper has shown that machine learning techniques can significantly aid the creation and maintenance of portals and domain-specific search engines.
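The text-classification example mentioned in the conclusion can be sketched with a small multinomial naive Bayes classifier using add-one smoothing. Naive Bayes is chosen here as an assumption, since the slides do not name a classifier, and the training documents and labels are made up for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labelled documents for an e-book portal.
TRAINING = [
    ("free e-book download pdf", "relevant"),
    ("novel e-book chapter", "relevant"),
    ("football match score", "other"),
    ("weather forecast today", "other"),
]


def train(docs):
    """Count words per label and documents per label."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in docs:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    """Return the label with the highest log-probability under naive Bayes."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total_docs)  # class prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one smoothing keeps unseen words from zeroing the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

A portal could run such a classifier on each crawled page and keep only the pages labelled as relevant to its domain.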
ADVANTAGES
• These techniques allow portals to be created quickly with minimal effort.
• Performance is based on the rewards accumulated over time.
• The environment presents situations with delayed rewards.
DISADVANTAGES
• Backtracking: the algorithm may fail to backtrack to the original path, resulting in a deadlock state.
• Initial and goal states and the rules must be specified, and the rules sometimes modified if necessary.
• If the knowledge base of the expert system is incorrect or lacks facts and figures, the solutions it produces are ineffective.