
User:Eddy.caron/DIET

From Wikipedia, the free encyclopedia
DIET
Developer(s): INRIA, École Normale Supérieure de Lyon, SysFera, CNRS, Claude Bernard University Lyon 1
Stable release: 2.6.1 / 04/11/11
Written in: C++, CORBA
Operating system: Cross-platform
Type: Grid and Cloud Computing
License: CeCILL
Website: graal.ens-lyon.fr/DIET

DIET is middleware for high-performance computing, created in 2000[1]. It is currently developed by INRIA, École Normale Supérieure de Lyon, SysFera, CNRS, and Claude Bernard University Lyon 1. It is open-source software released under the CeCILL license.

Like NetSolve/GridSolve and Ninf, DIET is compliant with the GridRPC standard from the Open Grid Forum[2].

The aim of the DIET project is to develop a set of tools to build computational servers. The distributed resources are managed in a transparent way through the middleware. It can work with workstations, clusters, Grids and Clouds.

DIET is used to manage the Décrypthon Grid installed by IBM in 6 French universities (Bordeaux 1, Lille 1, Paris 6, ENS Lyon, Crihan in Rouen, Orsay).

Architecture


Usually, GridRPC environments have five different components: clients that submit problems to servers, servers that solve the problems sent by clients, a database that contains information about software and hardware resources, a scheduler that chooses an appropriate server depending on the problem sent and the information contained in the database, and monitors that get information about the status of the computational resources.

DIET's architecture follows a different design. It is composed of:

  1. a client - the application that uses DIET to solve problems. Clients can connect to DIET from a web page or through an API or compiled program.
  2. a Master Agent (MA) that receives computation requests from clients. The MA then collects computation abilities from the servers and chooses one based on scheduling criteria. The reference of the chosen server is returned to the client. A client can be connected to an MA by a specific name server or a web page that stores the various MA locations.
  3. a Local Agent (LA) that relays requests and information between MAs and servers. The information stored on an LA is the list of requests and, for each of its subtrees, the number of servers that can solve a given problem and information about the data distributed in this subtree. Depending on the underlying network topology, a hierarchy of LAs may be deployed between an MA and the servers.
  4. a Server Daemon (SeD) that is the point of entry of a computational server. It manages a processor or a cluster. The information stored on a SeD is the list of the data available on a server (possibly with their distribution and the way to access them), the list of the problems that can be solved on it, and all the information concerning its load (e.g., CPU capacity, available memory).
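
The request path through this hierarchy can be sketched as a small tree model. This is a minimal illustration and not DIET's API: the class names `SeD` and `Agent`, the `collect` and `submit` methods, and the use of free CPU as the only scheduling criterion are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class SeD:
    """Hypothetical server daemon: knows which problems it solves and its load."""
    name: str
    problems: set
    free_cpu: float  # fraction of free CPU, between 0 and 1

    def collect(self, problem):
        # A SeD answers only for problems it can solve.
        return [self] if problem in self.problems else []


@dataclass
class Agent:
    """Hypothetical agent: a Local Agent, or a Master Agent when it is the root."""
    name: str
    children: list = field(default_factory=list)

    def collect(self, problem):
        # Forward the request down every subtree and gather candidate servers.
        candidates = []
        for child in self.children:
            candidates.extend(child.collect(problem))
        return candidates

    def submit(self, problem):
        # Master Agent role: choose one server using a scheduling criterion
        # (assumed here: the largest share of free CPU) and return its reference.
        candidates = self.collect(problem)
        if not candidates:
            return None
        return max(candidates, key=lambda sed: sed.free_cpu)


sed1 = SeD("sed1", {"matmul"}, free_cpu=0.2)
sed2 = SeD("sed2", {"matmul", "fft"}, free_cpu=0.9)
sed3 = SeD("sed3", {"fft"}, free_cpu=0.5)
la1 = Agent("LA1", [sed1, sed2])
la2 = Agent("LA2", [sed3])
ma = Agent("MA", [la1, la2])

chosen = ma.submit("matmul")
print(chosen.name)
```

In this sketch the client only ever talks to the root agent and receives a server reference back, which mirrors the division of roles in the list above; the real agents also track requests and data distribution, which the model leaves out.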

Multi-hierarchy


Two approaches were developed:

  • a multi-MA extension was developed by the University of Franche-Comté. Several DIET platforms are shared by interconnecting their respective Master Agents (MAs), which are linked by a communication graph. Clients request available SeDs from their MA as usual. If the MA finds an available SeD able to solve the problem, it returns its reference to the client. If it does not find one, it forwards the request to other MAs, which can in turn forward it to others, and so on. When an MA finds a SeD that can handle the client's request, it returns the SeD's reference to the client's MA, which returns it to the client. The client can then use that SeD to solve its problem.
  • a P2P Multi-MA extension called DIET_j was also designed. The aggregation of different independent DIET hierarchies (a multi-hierarchy architecture) could be managed using the P2P paradigm. This approach was based on the JXTA-J2SE toolbox for the on-demand discovery and connection of MAs. This project is no longer maintained.
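
The forwarding behaviour of the multi-MA extension can be sketched as a search over the MA communication graph. The code below is an illustrative model only: the `MasterAgent` class, its methods, and the `visited` set used to stop a request from looping are assumptions for the example, not a description of how DIET implements forwarding.

```python
class MasterAgent:
    """Hypothetical Master Agent with local servers and neighbouring MAs."""

    def __init__(self, name, local_services):
        self.name = name
        # Maps a problem name to the SeD reference able to solve it locally.
        self.local_services = dict(local_services)
        self.neighbours = []

    def connect(self, other):
        # The communication graph is undirected in this sketch.
        self.neighbours.append(other)
        other.neighbours.append(self)

    def find_sed(self, problem, visited=None):
        # The visited set keeps a request from circulating forever in a cyclic graph.
        visited = set() if visited is None else visited
        visited.add(self.name)
        if problem in self.local_services:
            return self.local_services[problem]
        # No local SeD: forward the request to MAs that have not seen it yet.
        for neighbour in self.neighbours:
            if neighbour.name not in visited:
                reference = neighbour.find_sed(problem, visited)
                if reference is not None:
                    return reference
        return None


ma_a = MasterAgent("A", {"matmul": "sed-a1"})
ma_b = MasterAgent("B", {})
ma_c = MasterAgent("C", {"fft": "sed-c1"})
ma_a.connect(ma_b)
ma_b.connect(ma_c)
ma_c.connect(ma_a)  # closes the cycle A-B-C

print(ma_a.find_sed("fft"))
```

The reference travels back along the chain of calls to the MA that the client contacted, so the client never needs to know which platform owns the chosen SeD.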

Workflow Management


For workflow management, DIET uses an additional entity called MA DAG. This entity can work in two modes: one in which it defines a complete scheduling of the workflow (ordering and mapping), and one in which it defines only an ordering for the workflow execution. In the second mode, mapping is done in the next step by the client, which uses the Master Agent to find the server where each workflow service should be run.
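
The difference between the two modes can be illustrated with a toy workflow. This is a sketch under stated assumptions: the task names, the `servers` table standing in for a Master Agent lookup, and the functions `order_only` and `full_schedule` are hypothetical, and the ordering uses Python's standard `graphlib` module (Python 3.9 or later) rather than any DIET component.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each task maps to the set of tasks it depends on.
workflow = {
    "preprocess": set(),
    "solve_a": {"preprocess"},
    "solve_b": {"preprocess"},
    "merge": {"solve_a", "solve_b"},
}

# Hypothetical server table standing in for a Master Agent lookup.
servers = {
    "preprocess": "sed1",
    "solve_a": "sed2",
    "solve_b": "sed3",
    "merge": "sed1",
}


def order_only(dag):
    """Ordering-only mode: return an execution order; mapping is left to the client."""
    return list(TopologicalSorter(dag).static_order())


def full_schedule(dag, find_server):
    """Complete-scheduling mode: return the ordering together with the mapping."""
    return [(task, find_server(task)) for task in order_only(dag)]


ordering = order_only(workflow)
schedule = full_schedule(workflow, servers.get)
print(ordering)
print(schedule)
```

In the ordering-only case the client would walk through `ordering` and ask for a server per task at execution time, which lets the mapping reflect the load at that moment rather than at submission.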

Scheduling


DIET provides a degree of control over the scheduling subsystem via plug-in schedulers[3]. When a service request from an application arrives at a SeD, the SeD creates a performance-estimation vector, a collection of performance-estimation values that are pertinent to the scheduling process for that application. The values to be stored in this structure can be either values provided by CoRI (Collectors of Resource Information) or custom values generated by the SeD itself. The design of the estimation vector's subsystem is modular.

CoRI generates a basic set of performance-estimation values which are stored in the estimation vector and identified by system-defined tags. The following table lists the tags that may be generated by a standard CoRI installation.

Information tag (starts with EST)   Multi-value   Explanation
TCOMP                                             predicted time to solve a problem (s)
TIMESINCELASTSOLVE                                time elapsed since the last solve
FREECPU                                           amount of free CPU, between 0 and 1
LOADAVG                                           average CPU load
FREEMEM                                           amount of free memory (Mb)
NBCPU                                             number of available CPUs
CPUSPEED                            Yes           frequency of the CPUs (MHz)
TOTALMEM                                          total memory size (Mb)
BOGOMIPS                            Yes           the BogoMips
CACHECPU                            Yes           cache size of the CPUs (Kb)
TOTALSIZEDISK                                     size of the partition (Mb)
FREESIZEDISK                                      amount of free space on the partition (Mb)
DISKACCESREAD                                     average disk read throughput (Mb/sec)
DISKACCESWRITE                                    average disk write throughput (Mb/sec)
ALLINFOS                            Yes           [empty] fill all possible fields
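
How a plug-in scheduler might use such vectors can be sketched as follows. The code is a simplified model, not DIET's plug-in scheduler API: the dictionary layout, the sample values, and the functions `most_free_cpu`, `aggregate_cpu_speed`, and `rank_servers` are assumptions for the example. Only the tag names come from the table above.

```python
# Hypothetical estimation vectors, one per SeD, keyed by CoRI tag names.
# Multi-value tags such as CPUSPEED hold one value per CPU.
estimations = {
    "sed1": {"FREECPU": 0.25, "FREEMEM": 2048, "NBCPU": 2, "CPUSPEED": [2400, 2400]},
    "sed2": {"FREECPU": 0.80, "FREEMEM": 1024, "NBCPU": 4, "CPUSPEED": [2000] * 4},
    "sed3": {"FREECPU": 0.80, "FREEMEM": 4096, "NBCPU": 1, "CPUSPEED": [3000]},
}


def most_free_cpu(vector):
    """Hypothetical criterion: prefer free CPU, then free memory as a tie-break."""
    return (vector["FREECPU"], vector["FREEMEM"])


def aggregate_cpu_speed(vector):
    """Hypothetical criterion: prefer the largest summed CPU frequency."""
    return sum(vector["CPUSPEED"])


def rank_servers(vectors, criterion):
    """Sort server names from best to worst according to the chosen criterion."""
    return sorted(vectors, key=lambda name: criterion(vectors[name]), reverse=True)


by_free_cpu = rank_servers(estimations, most_free_cpu)
by_cpu_speed = rank_servers(estimations, aggregate_cpu_speed)
print(by_free_cpu)
print(by_cpu_speed)
```

Swapping the criterion function changes the ranking without touching the rest of the code, which is the kind of application-level control that plug-in schedulers are meant to provide.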

DIET Data Management


Three different data managers have been integrated into DIET:

  1. DTM from the University of Franche-Comté (not maintained);
  2. JuxMEM from the IRISA (not maintained)[4];
  3. DAGDA from École Normale Supérieure de Lyon.

[Figure: Dagda-archi.png]

DIET LRMS Management


Parallel resources are generally accessible through an LRMS (Local Resource Management System), also called a batch system. DIET provides an interface to several existing LRMSs to execute jobs: LoadLeveler on IBM resources; OpenPBS, which is a fork of the well-known PBS system; and OAR, developed by IMAG in Grenoble and used on the Grid'5000 research grid. Most of the submitted jobs are parallel jobs, coded using the MPI standard with an implementation such as MPICH or LAM.

Cloud-resource management


A Cloud extension for DIET was created in 2009[5]. DIET is thus able to access Cloud resources through two existing Cloud providers:

  1. Eucalyptus, which is open-source software developed by the University of California, Santa Barbara.
  2. Amazon Elastic Compute Cloud, which is a commercial service that is part of Amazon.com's cloud computing services.

References

  1. Caron, Eddy (2006). "DIET: A Scalable Toolbox to Build Network Enabled Servers on the Grid". International Journal of High Performance Computing Applications. 20 (3): 335–352. doi:10.1177/1094342006067472.
  2. Caniou, Yves (2009). Grid Technology and Applications: Recent Developments. Chapter: High performance GridRPC middleware. Nova Science Publishers. ISBN 978-1-60692-768-7.
  3. Caron, Eddy (January 2008). "Design of plug-in schedulers for a GridRPC environment". Future Generation Computer Systems. 24 (1): 46–57. doi:10.1016/j.future.2007.02.005.
  4. Antoniu, Gabriel (November 2005). "JuxMem: An Adaptive Supportive Platform for Data Sharing on the Grid". Scalable Computing: Practice and Experience. 6 (3): 45–55.
  5. Caron, Eddy (September 2009). "Cloud Computing Resource Management through a Grid Middleware: A Case Study with DIET and Eucalyptus". IEEE International Conference on Cloud Computing (CLOUD 2009): 151–154. doi:10.1109/CLOUD.2009.70. ISBN 978-1-4244-5199-9.


Category:Cloud computing Category:Grid computing products Category:Workflow technology