How to crawl/index the links on a single page: Google Search Appliance


I am new to the GSA and don't have full admin access to the system, so I have to forward requests through ICT Services to have changes made to our crawls and collections.

I hope someone can help with this question:

I have a single web page that has a list of links to 180 documents. Most of them are stored in the same subdirectory, /docs/, which contains 2400 documents. The rest are scattered across the site in a number of other subdirectories, e.g. /finance/, /hr/, etc.

At the moment, one of two things happens: either the single web page is indexed and none of the 180 links, or the one page is indexed plus all of the 2400 documents in the /docs/ subdirectory.

I want to be able to crawl/index the page and the 180 links, and create a separate collection for them.

Is there a simple way to do this?

Regards,
Henry

  1. Instead of configuring a URL pattern under Start URLs and Follow Patterns, configure the complete URLs. Take the 180 document URLs plus the single web page URL and put all 181 URLs under both Start URLs and Follow Patterns. Because complete URLs are configured and no common URL pattern is kept under the follow patterns, the GSA will not crawl other URLs in the application.
  2. Create a new collection and place the 180 document URLs plus the single web page URL (or a generic pattern that matches all 181 URLs) in the collection under "Include Content Matching the Following Patterns".
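
Typing 181 URLs by hand is error-prone, so the list for step 1 could be generated from the page itself. The following is only a minimal sketch in Python, not a GSA feature: the function name `build_url_list`, the class `LinkCollector`, and the `www.example.com` URLs are all made up for illustration, and it assumes the links on the page are ordinary `<a href="...">` tags. It prints the page URL followed by each unique absolute link, one per line, ready to be passed on to whoever maintains the crawl configuration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links and drop any #fragment.
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                # Keep only web URLs (skips mailto:, javascript:, etc.)
                # and skip links that were already seen.
                if absolute.startswith(("http://", "https://")) and absolute not in self.links:
                    self.links.append(absolute)


def build_url_list(page_url, html):
    """Return the page URL followed by every unique link found in its HTML."""
    collector = LinkCollector(page_url)
    collector.feed(html)
    return [page_url] + [u for u in collector.links if u != page_url]


if __name__ == "__main__":
    # Hypothetical stand-in for the real list page; in practice the HTML
    # would be saved from the browser or fetched from the site.
    sample_html = """
    <ul>
      <li><a href="/docs/policy-a.pdf">Policy A</a></li>
      <li><a href="/finance/budget.pdf">Budget</a></li>
      <li><a href="/docs/policy-a.pdf">Policy A (duplicate)</a></li>
    </ul>
    """
    for url in build_url_list("http://www.example.com/list.html", sample_html):
        print(url)
```

The same list can be reused for the collection in step 2, since both steps take the same 181 URLs.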

I assume you do not want to index the other 2400 documents on the GSA. Hope this helps.

Regards,

Mohan.

