5. Archiving the Web – a Data Management Perspective

Author(s):  
Athena Vakali ◽  
George Pallis ◽  
Lefteris Angelis

The explosive growth of the Web has drastically increased the rate at which information circulates and is disseminated. As the number of both Web users and Web sources grows significantly every day, crucial data management issues, such as clustering on the Web, should be addressed and analyzed. Clustering has been proposed as a way to improve both information availability and personalization for Web users. Clusters on the Web are either users’ sessions or Web information sources, which are managed in a variety of applications and implementation testbeds. This chapter focuses on clustering information over the Web and surveys the theoretical background and adopted practices of the most popular emerging and challenging clustering research efforts. An up-to-date survey of the existing clustering schemes is given, to be of use for both researchers and practitioners interested in the area of Web data mining.


2013 ◽  
Vol 7 (2) ◽  
pp. 157-164 ◽  
Author(s):  
Jinchuan Chen ◽  
Yueguo Chen ◽  
Xiaoyong Du ◽  
Cuiping Li ◽  
Jiaheng Lu ◽  
...  

2018 ◽  
Author(s):  
SeaPlan

This report reviews existing observation systems, regional initiatives, existing infrastructure, data management concepts, and user issues. There are many active, high-quality systems in place, both regional and national, and groups of dedicated scientists and data managers already working on data management systems. These systems generally live in isolation from the Web 2.0 world of rapidly evolving technologies to search and share data. Our recommendation is to leverage these science systems, and the data that they provide, using the latest commonly available technology.


2021 ◽  
Author(s):  
Renato Alves ◽  
Dimitrios Bampalikis ◽  
Leyla Jael Castro ◽  
José María Fernández ◽  
Jennifer Harrow ◽  
...  

Data Management Plans are now considered a key element of Open Science. They describe the data management life cycle for the data to be collected, processed and/or generated within the lifetime of a particular project or activity. A Software Management Plan (SMP) plays the same role but for software. Beyond its management perspective, the main advantage of an SMP is that it both provides clear context to the software that is being developed and raises awareness. Although there are a few SMPs already available, most of them require significant technical knowledge to be effectively used. ELIXIR has developed a low-barrier SMP, specifically tailored for life science researchers and aligned to the FAIR Research Software principles. Starting from the Four Recommendations for Open Source Software, the ELIXIR SMP was iteratively refined by surveying the practices of the community and incorporating the received feedback. The ELIXIR SMP is currently available as a survey; future plans include a human- and machine-readable version that can be automatically queried and connected to relevant tools and metrics within the ELIXIR Tools ecosystem and beyond.


Author(s):  
Ayuliana Ayuliana ◽  
Nailur Rahma ◽  
Titis Aulia ◽  
Ratri Permatasari

Padang Karunia Group received many suggestions for improvement from its employees, which an administrator collected manually. The more input employees submitted, the harder it became to collect manually. Padang Karunia Group therefore took the initiative to create an intranet application that saves the input in a database. To address this problem, this research develops a database application in the form of a dashboard that collects employees' criticisms and suggestions over the web. The application is expected to assist the company's management, especially in gathering criticisms and suggestions from employees, which are then evaluated for company improvement. We use a database design method based on the database application lifecycle, including requirements collection and analysis, conceptual database design, logical database design, DBMS selection, physical database design, and implementation. The results are a database design and a web-based dashboard application that collects criticisms and suggestions. With this application, the company can easily receive criticisms and suggestions from employees, simplify their storage and management, create reports, and reduce the cost of data collection compared with the previous manual process. The database also allows employees' criticisms and suggestions to be managed more effectively.


2019 ◽  
Vol 8 (1) ◽  
pp. 40-52 ◽  
Author(s):  
Sarah W. Kansa ◽  
Levent Atici ◽  
Eric C. Kansa ◽  
Richard H. Meadow

With the advent of the Web, increased emphasis on “research data management,” and innovations in reproducible research practices, scholars have more incentives and opportunities to document and disseminate their primary data. This article seeks to guide archaeologists in data sharing by highlighting recurring challenges in reusing archived data gleaned from observations on workflows and reanalysis efforts involving datasets published over the past 15 years by Open Context. Based on our findings, we propose specific guidelines to improve data management, documentation, and publishing practices so that primary data can be more efficiently discovered, understood, aggregated, and synthesized by wider research communities.


Author(s):  
Amelia Badica ◽  
Costin Badica ◽  
Elvira Popescu

The Web is designed as a major information provider for the human consumer. However, information published on the Web is difficult for a machine to understand and reuse. In this chapter, we show how well-established intelligent techniques based on logic programming and inductive learning, combined with more recent XML technologies, might help to improve the efficiency of data extraction from Web pages. Our work can be seen as a necessary step toward the more general problem of Web data management and integration.


Author(s):  
V. S. Kedrin ◽  
A. V. Rodyukov

The article considers current information technologies for organizing a component-based distributed system for the remote enrollment of applicants on the 1C:Enterprise 8.3 platform. A modified component architecture for interaction with a web contour is proposed; it implements a module for the dynamic design of the web contour interface, as well as automated data management functionality built directly on standard 1C software products. System principles for organizing the contour of the website “Personal account of an applicant” are formulated. They make it possible to integrate management of the web contour into off-the-shelf software solutions both for higher education (1C:University PROF) and for secondary vocational education (1C:College PROF), in terms of mechanisms for dynamically designing interfaces and the contour for processing site users' data. The implemented contour for designing site interfaces makes it possible to change the components of the site’s web forms dynamically and to define the details displayed in the web user interface. The mechanisms by which the interfaces of the “Personal account of an applicant” site interact dynamically with the data management contour on the 1C:Enterprise 8.3 platform are described. The information components of the interaction contour of the site interfaces are determined, and the elements of the site control contour and their purpose are specified. A universal conceptual scheme for developing a web-based data moderation contour is formulated, and technologies for interaction with an internal accounting system for automating an admission campaign are determined.

