Links Extractor. Even when a web page is not stored (because it was deemed irrelevant to the domain, or is not in the targeted language), its links are extracted and added to the list of links scheduled to be visited. Since the crawling strategy is critical for a focused crawler, a score $s_l$ is calculated for each link as follows:

\[ s_l = \frac{p}{L} + \sum_{i=1}^{N} n_i \cdot w_i \]

where $p$ is the relevance score of the source page, $L$ is the number of links originating from the source page, $N$ is the number of terms in the topic definition, $n_i$ denotes the number of occurrences of the $i$-th term in the text surrounding the link, and $w_i$ is the weight of the $i$-th term. Under this approach, the link score is influenced by both the relevance score of the source web page (see 2.1.5) and the estimated relevance of the link's anchor text.
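As an illustration of the formula above, the following is a minimal Python sketch, not the crawler's actual implementation. The function name `link_score`, its parameter names, and the use of case-insensitive substring counting to obtain the term occurrences are all assumptions made for this example; the source does not specify how terms are matched in the surrounding text.

```python
from typing import Mapping


def link_score(
    page_relevance: float,            # p: relevance score of the source page
    num_links: int,                   # L: number of links on the source page
    term_weights: Mapping[str, float],  # topic terms mapped to their weights w_i
    surrounding_text: str,            # text around the link
) -> float:
    """Return p / L plus the sum of n_i * w_i over all topic terms."""
    if num_links <= 0:
        raise ValueError("num_links must be positive")

    text = surrounding_text.lower()
    # n_i is approximated here by counting non-overlapping, case-insensitive
    # substring matches (an assumption of this sketch).
    term_sum = sum(
        text.count(term.lower()) * weight
        for term, weight in term_weights.items()
    )
    return page_relevance / num_links + term_sum


# Hypothetical example values:
# 10 / 5 + (1 * 2.0 + 2 * 1.0) = 6.0
score = link_score(
    page_relevance=10.0,
    num_links=5,
    term_weights={"crawler": 2.0, "web": 1.0},
    surrounding_text="Focused web crawler for web pages",
)
```

A crawler could use such scores to order its queue of pending links so that higher-scoring links are visited first; the source does not describe how the queue is actually managed.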