
Arquivo.pt: the Portuguese web-archive

Arquivo.pt - The Portuguese web-archive (PWA) is the national Web archive of Portugal. Its mission is to periodically archive contents of national interest available on the Web, storing and preserving for future generations information of historical relevance. It is a service of the Foundation for Science and Technology (FCT).



31,973
RESULTS


Arquivo.pt: the Portuguese web-archive
web

eye 10,216

Incremental crawl of the Portuguese web performed between 17 May 2011 and 17 June 2011 mainly from .PT domain. The AWP10 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP7 as baseline. Thus, the files that remained unchanged from the AWP7 complete crawl were not archived (duplicated) on the AWP10 incremental crawl.
Topics: Incremental crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Arquivo.pt: the Portuguese web-archive
web

eye 19,073

Incremental crawl of the Portuguese web performed between 17 May 2011 and 17 June 2011 mainly from .PT domain. The AWP10 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP7 as baseline. Thus, the files that remained unchanged from the AWP7 complete crawl were not archived (duplicated) on the AWP10 incremental crawl.
Topics: Incremental crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Sixth collection FAWP. With Deduplicator.
Topics: Frequent crawl of news media from Portuguese web, Portuguese Web Archive, Portuguese online...
Arquivo.pt: the Portuguese web-archive
web

eye 11,891

Incremental crawl of the Portuguese web performed between 17 May 2011 and 17 June 2011 mainly from .PT domain. The AWP10 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP7 as baseline. Thus, the files that remained unchanged from the AWP7 complete crawl were not archived (duplicated) on the AWP10 incremental crawl.
Topics: Incremental crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Thirty-third collection AWP. Without Deduplicator.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Sixth collection FAWP. With Deduplicator.
Topics: Frequent crawl of news media from Portuguese web, Portuguese Web Archive, Portuguese online...
Arquivo.pt: the Portuguese web-archive
web

eye 8,529

Incremental crawl of the Portuguese web performed between 17 May 2011 and 17 June 2011 mainly from .PT domain. The AWP10 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP7 as baseline. Thus, the files that remained unchanged from the AWP7 complete crawl were not archived (duplicated) on the AWP10 incremental crawl.
Topics: Incremental crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Arquivo.pt: the Portuguese web-archive
web

eye 27,447

Incremental crawl of the Portuguese web performed between 17 May 2011 and 17 June 2011 mainly from .PT domain. The AWP10 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP7 as baseline. Thus, the files that remained unchanged from the AWP7 complete crawl were not archived (duplicated) on the AWP10 incremental crawl.
Topics: Incremental crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Arquivo.pt: the Portuguese web-archive
by Portuguese Web Archive
web

eye 50,778

Complete crawl of the Portuguese web performed between March and May 2008 mainly from .PT domain.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Twenty-sixth collection AWP. No Deduplicator.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Arquivo.pt: the Portuguese web-archive
web

eye 12,938

Incremental crawl of the Portuguese web performed between 30 June 2011 and 5 August 2011 mainly from .PT domain. The AWP11 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP7 as baseline. Thus, the files that remained unchanged from the AWP7 complete crawl were not archived (duplicated) on the AWP11 incremental crawl.
Topics: Incremental crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Arquivo.pt: the Portuguese web-archive
web

eye 2,232

Complete crawl of the Portuguese web performed between 5 November 2013 and 13 January 2014 mainly from .PT domain. The AWP15 crawl did NOT use DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/).
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Arquivo.pt: the Portuguese web-archive
web

eye 26,560

Complete crawl of the Portuguese web performed between 5 November 2013 and 13 January 2014 mainly from .PT domain. The AWP15 crawl did NOT use DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/).
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Arquivo.pt: the Portuguese web-archive
web

eye 15,604

Incremental crawl of the Portuguese web performed between 17 May 2011 and 17 June 2011 mainly from .PT domain. The AWP10 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP7 as baseline. Thus, the files that remained unchanged from the AWP7 complete crawl were not archived (duplicated) on the AWP10 incremental crawl.
Topics: Incremental crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Thirtieth collection AWP. With Deduplicator.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Thirty-second collection AWP. Without Deduplicator.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Twenty-fifth collection AWP. With Deduplicator.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Twenty-sixth collection AWP. No Deduplicator.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Thirty-third collection AWP. Without Deduplicator.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Twenty-sixth collection AWP. No Deduplicator.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Thirty-second collection AWP. Without Deduplicator.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Thirty-third collection AWP. Without Deduplicator.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Twenty-seventh collection AWP. No Deduplicator.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Twenty-ninth collection AWP. No Deduplicator.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Thirtieth collection AWP. With Deduplicator.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Thirty-third collection AWP. Without Deduplicator.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Thirtieth collection AWP. With Deduplicator.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Twenty-seventh collection AWP. No Deduplicator.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Thirty-first collection AWP. With Deduplicator.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Arquivo.pt: the Portuguese web-archive
web

eye 275

Incremental crawl of the Portuguese web performed between 23 September 2014 and 24 October 2014 mainly from .PT domain. The AWP16 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP15 as baseline. Thus, the files that remained unchanged from the AWP15 complete crawl were not archived (duplicated) on the AWP16 incremental crawl.
Topics: Incremental crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
First collection FAWP. Without Deduplicator.
Topics: Frequent crawl of news media from Portuguese web, Portuguese Web Archive, Portuguese online...
Arquivo.pt: the Portuguese web-archive
web

eye 3,218

Incremental crawl of the Portuguese web performed in August 2010 mainly from .PT domain. The AWP8 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP7 as baseline. Thus, the files that remained unchanged from the AWP7 complete crawl were not archived (duplicated) on the AWP8 incremental crawl.
Topics: Incremental crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Thirty-third collection AWP. Without Deduplicator.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Sixth collection FAWP. With Deduplicator.
Topics: Frequent crawl of news media from Portuguese web, Portuguese Web Archive, Portuguese online...
Sixth collection FAWP. With Deduplicator.
Topics: Frequent crawl of news media from Portuguese web, Portuguese Web Archive, Portuguese online...
Twenty-ninth collection AWP. No Deduplicator.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Arquivo.pt: the Portuguese web-archive
web

eye 770

Complete crawl of the Portuguese web performed between 5 February 2016 and 3 May 2016 mainly from .PT domain. The AWP20 crawl did NOT use DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/).
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Arquivo.pt: the Portuguese web-archive
web

eye 2,073

Complete crawl of the Portuguese web performed in October 2009 mainly from .PT domain.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Second collection FAWP. Without Deduplicator.
Topics: Frequent crawl of news media from Portuguese web, Portuguese Web Archive, Portuguese online...
Arquivo.pt: the Portuguese web-archive
web

eye 85,347

Complete crawl of the Portuguese web performed in October 2009 mainly from .PT domain.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Arquivo.pt: the Portuguese web-archive
web

eye 5,802

Complete crawl of the Portuguese web performed in May 2010 mainly from .PT domain.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Arquivo.pt: the Portuguese web-archive
web

eye 49,304

Complete crawl of the Portuguese web performed between October and December 2008 mainly from .PT domain.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Arquivo.pt: the Portuguese web-archive
web

eye 743

Complete crawl of the Portuguese web performed between 5 February 2016 and 3 May 2016 mainly from .PT domain. The AWP20 crawl did NOT use DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/).
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Twenty-sixth collection AWP. No Deduplicator.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Arquivo.pt: the Portuguese web-archive
web

eye 2,449

Complete crawl of the Portuguese web performed between 10 April 2015 and 9 June 2015 mainly from .PT domain. The AWP17 crawl did NOT use DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/).
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Sixth collection FAWP. With Deduplicator.
Topics: Frequent crawl of news media from Portuguese web, Portuguese Web Archive, Portuguese online...
Sixth collection FAWP. With Deduplicator.
Topics: Frequent crawl of news media from Portuguese web, Portuguese Web Archive, Portuguese online...
Twenty-fourth collection AWP. With Deduplicator.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Incremental crawl of the Portuguese web performed between 30 May 2016 and 3 August 2016 mainly from .PT domain. The AWP21 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP20 as baseline. Thus, the files that remained unchanged from the AWP20 complete crawl were not archived (duplicated) on the AWP21 incremental crawl.
Topics: Incremental crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Arquivo.pt: the Portuguese web-archive
web

eye 863

Incremental crawl of the Portuguese web performed between 23 September 2014 and 24 October 2014 mainly from .PT domain. The AWP16 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP15 as baseline. Thus, the files that remained unchanged from the AWP15 complete crawl were not archived (duplicated) on the AWP16 incremental crawl.
Topics: Incremental crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Thirty-third collection AWP. Without Deduplicator.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Twenty-eighth collection AWP. No Deduplicator.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Twenty-eighth collection AWP. No Deduplicator.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Arquivo.pt: the Portuguese web-archive
web

eye 10,125

Incremental crawl of the Portuguese web performed between 17 May 2011 and 17 June 2011 mainly from .PT domain. The AWP10 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP7 as baseline. Thus, the files that remained unchanged from the AWP7 complete crawl were not archived (duplicated) on the AWP10 incremental crawl.
Topics: Incremental crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Thirtieth collection AWP. With Deduplicator.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Twenty-seventh collection AWP. No Deduplicator.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Incremental crawl of the Portuguese web performed between 30 May 2016 and 3 August 2016 mainly from .PT domain. The AWP21 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP20 as baseline. Thus, the files that remained unchanged from the AWP20 complete crawl were not archived (duplicated) on the AWP21 incremental crawl.
Topics: Incremental crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Arquivo.pt: the Portuguese web-archive
web

eye 343

Incremental crawl of the Portuguese web performed between 12 November 2015 and 5 January 2016 mainly from .PT domain. The AWP19 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP18 as baseline. Thus, the files that remained unchanged from the AWP18 complete crawl were not archived (duplicated) on the AWP19 incremental crawl.
Topics: Incremental crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Arquivo.pt: the Portuguese web-archive
web

eye 12,563

Complete crawl of the Portuguese web performed in May 2010 mainly from .PT domain.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Fourth collection FAWP. Without Deduplicator.
Topics: Frequent crawl of news media from Portuguese web, Portuguese Web Archive, Portuguese online...
Arquivo.pt: the Portuguese web-archive
web

eye 11,831

Incremental crawl of the Portuguese web performed between 20 January 2011 and 22 March 2011 mainly from .PT domain. The AWP9 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP7 as baseline. Thus, the files that remained unchanged from the AWP7 complete crawl were not archived (duplicated) on the AWP9 incremental crawl.
Topics: Incremental crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Sixth collection FAWP. With Deduplicator.
Topics: Frequent crawl of news media from Portuguese web, Portuguese Web Archive, Portuguese online...
Arquivo.pt: the Portuguese web-archive
web

eye 2,153

Incremental crawl of the Portuguese web performed between 13 August 2015 and 5 November 2015 mainly from .PT domain. The AWP18 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP17 as baseline. Thus, the files that remained unchanged from the AWP17 complete crawl were not archived (duplicated) on the AWP18 incremental crawl.
Topics: Incremental crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Arquivo.pt: the Portuguese web-archive
web

eye 238

Incremental crawl of the Portuguese web performed between 12 November 2015 and 5 January 2016 mainly from .PT domain. The AWP19 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP18 as baseline. Thus, the files that remained unchanged from the AWP18 complete crawl were not archived (duplicated) on the AWP19 incremental crawl.
Topics: Incremental crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Arquivo.pt: the Portuguese web-archive
web

eye 293

Incremental crawl of the Portuguese web performed between 12 November 2015 and 5 January 2016 mainly from .PT domain. The AWP19 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP18 as baseline. Thus, the files that remained unchanged from the AWP18 complete crawl were not archived (duplicated) on the AWP19 incremental crawl.
Topics: Incremental crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Arquivo.pt: the Portuguese web-archive
web

eye 1,026

Incremental crawl of the Portuguese web performed between 12 November 2015 and 5 January 2016 mainly from .PT domain. The AWP19 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP18 as baseline. Thus, the files that remained unchanged from the AWP18 complete crawl were not archived (duplicated) on the AWP19 incremental crawl.
Topics: Incremental crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Arquivo.pt: the Portuguese web-archive
web

eye 450

Complete crawl of the Portuguese web performed between 5 November 2013 and 13 January 2014 mainly from .PT domain. The AWP15 crawl did NOT use DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/).
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Arquivo.pt: the Portuguese web-archive
web

eye 199

Incremental crawl of the Portuguese web performed between 23 September 2014 and 24 October 2014 mainly from .PT domain. The AWP16 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP15 as baseline. Thus, the files that remained unchanged from the AWP15 complete crawl were not archived (duplicated) on the AWP16 incremental crawl.
Topics: Incremental crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Arquivo.pt: the Portuguese web-archive
web

eye 4,434

Incremental crawl of the Portuguese web performed between 12 November 2015 and 5 January 2016 mainly from .PT domain. The AWP19 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP18 as baseline. Thus, the files that remained unchanged from the AWP18 complete crawl were not archived (duplicated) on the AWP19 incremental crawl.
Topics: Incremental crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Arquivo.pt: the Portuguese web-archive
web

eye 1,122

Incremental crawl of the Portuguese web performed between 12 November 2015 and 5 January 2016 mainly from .PT domain. The AWP19 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP18 as baseline. Thus, the files that remained unchanged from the AWP18 complete crawl were not archived (duplicated) on the AWP19 incremental crawl.
Topics: Incremental crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Thirtieth collection AWP. With Deduplicator.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Twenty-fourth collection AWP. With Deduplicator.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Thirty-first collection AWP. With Deduplicator.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Twenty-fourth collection AWP. With Deduplicator.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...
Twenty-fourth collection AWP. With Deduplicator.
Topics: Complete crawl of the Portuguese web, Portuguese Web Archive, Portuguese online publications,...