A Performance Contract System in a Grid Enabling Component Based Programming Environment

X. Xxxxxx, X. Xxxxxxxx, X. Xxxxxxx

XX-XXXX-XX-00-00 October 2004

Consiglio Nazionale delle Ricerche, Istituto di Calcolo e Reti ad Alte Prestazioni (ICAR)

– Sezione di Napoli, Complesso Universitario Monte X. Xxxxxx Xxx Xxxxxx, 00000 Xxxxxx, URL: xxx.xx.xxxx.xxx.xx


TR-ICAR-NA-04-16 October 2004


* This work is partially supported by the FIRB National Research Project Xxxx.xx, within the workpackage WP9 “Grid Enabled Scientific Libraries”

The ICAR-CNR technical reports are published by the Istituto di Calcolo e Reti ad Alte Prestazioni of the Consiglio Nazionale delle Ricerche. These reports, prepared under the sole scientific responsibility of the authors, describe research activities of ICAR staff and collaborators, in some cases in a preliminary form prior to final publication elsewhere.

A Performance Contract System in a Grid Enabling Component Based Programming Environment*

Pasquale Caruso1, Giuliano Laccetti2, Marco Lapegna2

1Institute of High Performance Computing and Networking

National Research Council, Naples branch – xxx Xxxxxx Xxxxx X.Xxxxxx, Xxxxxx, Xxxxx xxxxxxxx@xxx.xx

2Department of Mathematics and its Applications

University of Xxxxxx Xxxxxxxx XX, Xxx Xxxxxx Xxxxx X. Xxxxxx, 00000 Xxxxxx, Xxxxx

{giuliano.laccetti, xxxxx.xxxxxxx}@xxx.xxxxx.xx

Abstract. Grid computing is currently one of the most promising approaches for building large-scale, cost-effective applications. However, it needs a sophisticated software infrastructure to address several requirements. One of these is the ability to sustain predictable performance in the face of the fluctuations caused by the dynamic nature of the Grid. In this paper we describe the design and implementation of a Performance Contract System, a software infrastructure that manages the computational kernel of a grid application in order to address this aspect of Grid computing. We also describe the strategies adopted and the experience gained in integrating it into a grid-enabling, component-based programming environment that is still under development.

1. Introduction

As stated in [9], a Grid is a system that “… coordinates resources that are not subject to centralized control, … using standard, open, general purpose protocols and interfaces, … to deliver non trivial qualities of service”. This means that a Grid infrastructure is built on top of a collection of disparate and distributed resources (computers, databases, networks, software, …) and offers functionality greater than the simple sum of its parts [10]. The “added value” is a software architecture aimed at delivering good Quality of Service (QoS), and increasing attention has recently been devoted to the technologies that enable it (see for example [12]). Within this software infrastructure, a significant part, known as the Performance Contract System, is devoted to response time and delivered performance.

* This work has been partially supported by Italian Ministero dell’Istruzione, Universita’ e Ricerca (MIUR) within the activities of the WP9 workpackage “Grid Enabled Scientific Libraries” , coordinated by X. Xxxxx, part of the MIUR FIRB Xxxx.xx project “Piattaforme Abilitanti per Griglie Computazionali ad Alte Prestazioni Orientate ad Organizzazioni Virtuali Scalabili”.

Grid topics related to performance contract systems have been widely studied in recent years, mainly within the GrADS project (see for example [1,2]). Other papers (see [16]) study performance forecasting in distributed computing environments by means of algorithms that simulate an ideal user, suitably defined through rules describing its use of the resources. In [20] a Performance Contract for an application on a computational grid is defined from the computational cost of the algorithm, and is then checked and validated. In [15,23], on the other hand, approaches based on statistical data are introduced. In [17], results on the development of performance contracts based on the fitting of runtime data are reported. With regard to run-time monitoring, several tools for distributed applications are available [19,21,28]; they are briefly described in Section 2. Finally, in [18] a statistical analysis is developed that, instead of checking all the software modules of the application, uses only some meaningful sections of the application itself.

The paper is organized as follows: in Section 2 we outline our Performance Contract System and its role in a grid application; in Section 3 we introduce the software environment into which the Performance Contract System will be integrated; finally, in Section 4, we show some computational results on the definition of the performance contract and the related monitoring of a parallel routine that is part of a medical application.

2. The role of a Performance Contract System in a Grid Application

One of the goals of grid computing is the simple and transparent use of the computational resources of a distributed system [10]. To this end, several software units must work alongside the application. The tasks of these software modules include, for example, the selection of the resources, the definition of the performance contract, its monitoring and the management of possible contract violations. The module that manages all these activities is the Application Manager (AM), whose workflow is depicted in Figure 1. Its main tasks are:

1. selection of the computational resources on the basis of information about the application (e.g. the dimension of the problem), the user (e.g. the required priority), the state of the network (e.g. the resources available at that moment) and previous executions (e.g. the performance achieved on a machine already used). See [8] for an example of selection of the computational resources;

2. definition of the performance contract on the basis of the performance model of the application (e.g. based on its computational cost), of the resources selected in step 1, and of information from previous executions (e.g. stored in a “historical performance database”); more details on this aspect are given later in this section;

3. monitoring of the application; this is a very important aspect for reliable grid-enabled applications, because the actual performance can be very different from that specified in the Performance Contract. The dynamic nature of distributed resources that are not under the same centralized control can make these values differ considerably. Some existing tools for monitoring distributed applications are briefly described later in this section;

4. management of possible violations of the contract (e.g. migration of the application to other resources, redefinition of the terms of the contract, or continuation of the execution even in the presence of contract violations). See [24] for an example of a migration strategy.

[Figure 1: workflow diagram with the blocks user requests, resource broker, contract definition, launcher, monitor, violation management, history database, application and Grid, and the actions store load data and stop/restart application.]

Fig. 1. Workflow of the Application Manager

A Performance Contract System is the set of all software units of the Application Manager related to the performance contract and its monitoring. The notion of a Performance Contract is not new (see for example [27]), but in a grid environment it assumes a key role. A performance contract can be defined as a forecast of the performance of an application on given computational resources. More precisely, given:

• some computational resources (e.g. processors, memories, networks, ...)

• with given capabilities (e.g. computation speed, memory bandwidth, ...)

• and an application with given characteristics (e.g. dimension of the problem, amount of I/O, number of operations, ...)

a performance contract states

• the achievement of fixed performance levels (e.g. attainment of F operations/sec, execution of I iterations/sec, transfer of B bytes/sec)
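The four ingredients listed above can be collected in a small record. The following Python fragment is only a minimal sketch: the class name `PerformanceContract`, its field names, the `tolerance` parameter and all numeric values are assumptions introduced here for illustration and are not part of the system described in this paper.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class PerformanceContract:
    """Hypothetical record of the ingredients of a performance contract."""
    resources: Dict[str, int]        # e.g. number of processors
    capability: Dict[str, float]     # e.g. speed of each processor
    application: Dict[str, float]    # e.g. number of operations
    target_rates: Dict[str, float]   # promised rates (higher is better)

    def is_respected(self, metric: str, measured_rate: float,
                     tolerance: float = 0.0) -> bool:
        """True if the measured rate reaches the promised one.

        `tolerance` is a relative slack (assumed parameter), e.g. 0.1 = 10%.
        """
        return measured_rate >= self.target_rates[metric] * (1.0 - tolerance)


# Illustrative values only; they are not measurements from this paper.
contract = PerformanceContract(
    resources={"processors": 4},
    capability={"mflops_per_processor": 100.0},
    application={"operations": 1.0e9},
    target_rates={"iterations_per_sec": 0.25},
)
ok = contract.is_respected("iterations_per_sec", 0.26)
violated = not contract.is_respected("iterations_per_sec", 0.10)
```

Keeping the promised values separate from the description of resources and application mirrors the definition: the first three fields are the assumptions of the contract, the last one is what the contract states.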

Once the computational resources have been selected, the definition of the performance contract is essentially based on the following information:

1. use of a performance model based on the features of the application and of the selected computational resources. To be realistic, the model must take into account not only the computational cost of the algorithm, but also the workload of the resources, the fraction of peak performance actually obtainable, benchmark values, and so on. This approach can be called the Performance Model Approach, and it is used, for example, in [20].

2. use of data on the performance of previous executions. For example, it is possible to use a database that stores, for every computational resource selected over time, the average performance actually obtained, together with the standard deviation, which can be used as an estimate of the possible deviation from the average value. This approach can be called the Historical Performance Approach, and it is used, for example, in [23].
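As an illustration of the historical approach, the sketch below derives an expected value and an upper bound from the times of previous executions, using their average and standard deviation. The function name `contract_from_history`, the multiplier `k` and the sample values are assumptions made for this example; the paper does not prescribe how the standard deviation enters the contract.

```python
from statistics import mean, stdev


def contract_from_history(times, k=2.0):
    """Return (expected, upper_bound) for a timing from past executions.

    `times` are times recorded in earlier runs on the same resource;
    `k` (hypothetical parameter) sets how many standard deviations of
    slack are tolerated before a measurement counts as a violation.
    """
    expected = mean(times)
    spread = stdev(times) if len(times) > 1 else 0.0
    return expected, expected + k * spread


# Illustrative values only, not measurements from this paper.
history = [3.8, 3.9, 4.0, 3.9]
expected, upper = contract_from_history(history)
```

With a single record in the database the spread is taken as zero, so the bound coincides with the only observed value; a real system would probably need a fallback, such as the performance model, in that case.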

As previously said, because of the dynamic nature of a computational grid, during the execution of the application it is necessary to check the performance actually obtained and compare it with that stated in the contract. The monitoring step is carried out by means of a suitable tool called a process monitor. More precisely, at run time the monitor must periodically check whether the conditions of the performance contract are being respected. Learning only at the end of the execution that the contract was not respected is of little use, whereas a sufficiently frequent check makes it possible to take suitable actions (e.g. the migration of the application from an overloaded resource to another one, with the definition of a new contract). This can be done using tools such as the Automated Instrumentation and Monitoring System (AIMS) [28], Autopilot [21], Paradyn [19] or the commercial tool Vampir. We note, however, that all of them are based on the instrumentation of the code, that is, on the insertion of calls to library functions able to capture given information from the running code and to transmit it to a process monitor or a visualization tool.

Besides the tools for run-time remote monitoring of applications, we mention the Performance Application Programming Interface (PAPI) [6] and the Performance Counter Library (PCL) [3]. Both libraries supply software interfaces to access the hardware performance counters of most microprocessors. They can be used in combination with one of the monitoring tools described above, in order to gain detailed information on the use of the hardware as well.

3. A Grid-oriented component based programming environment

The component programming model is a well-known paradigm for developing applications. This approach, which can be considered an evolution of the object-oriented model, makes it possible to build applications by binding independent software modules (the components) that interact with other components by means of well-defined interfaces (the ports), according to a set of specific rules of the programming environment (the framework). The separation of the support code implemented in the framework from the application code in the components allows the user to focus on the application, without dealing with environment-dependent details [7]. Because the components describing the application can be deployed on separate hardware and software environments, the component programming model is also a very promising approach for developing grid-oriented programming environments [13,14]. A new grid-oriented component-based programming environment is the focus of an Italian research project [25], within which we are currently working to integrate a Performance Contract System into the programming environment.

As already mentioned, the key role of the framework is to hide the details of the programming environment, exposing only the services required to implement the components. In a grid-oriented programming environment, therefore, alongside the classical services related to the life cycle of the components (instantiation, resource allocation, …), the framework has to provide more sophisticated grid-oriented services such as resource discovery and remote data access, as well as the actions concerning application structuring and rescheduling. As already said in Section 2, in a grid-enabled application these services are the responsibility of the Application Manager, which in this context is naturally implemented in the framework. It is important to note that the Globus middleware [11] will be integrated in the framework in order to address the problems related to accessing geographically distributed resources. In the Xxxx.xx programming model, the components are supplied with several types of ports [25]:

1. Remote Procedure Call (RPC) interfaces. These are the classical CCA-like ports, mainly for client-server applications [7]. These interfaces define the kind of services that the component provides or uses.

2. Stream interfaces. These interfaces allow the unidirectional communication of a data stream between two components, and make better use of the bandwidth in the case of high-latency networks.

3. Event interfaces. These interfaces are used for the interaction of the components with the framework. The asynchronous events of a computational grid (e.g. the failure of resources) can be communicated to the components through these interfaces, so that actions can be taken to reschedule the application on different resources.

4. Configuration interfaces. These interfaces allow the Application Manager to access and modify information and data inside the components, and can be used for the reconfiguration steps (e.g. stop and restart of the application on other resources).

[Figure 2: diagram showing the contract developer and the monitor inside the framework, with configuration ports, RPC/stream ports and event ports.]

Fig. 2. Integration of the Performance Contract System into the Xxxx.xx environment

In order to integrate the Performance Contract System described in Section 2 into the Xxxx.xx programming environment, we first note that, while the application can be developed by assembling the components directly through the RPC or stream interfaces, the software units composing the Performance Contract System (monitor and contract developer), together with all related files and data structures, can be implemented directly in the framework, interacting with the application through the Configuration interfaces.

Figure 2 shows an example of an application with three components (namely C1, C2 and C3). The components exchange their data by means of the stream ports (black line), while the monitor and the contract developer access the data in the components by means of the configuration ports (dotted lines).

More precisely, with regard to the integration of the monitor in the environment, the components can be enriched with suitable scripting annotations, specific to the application (e.g. number of floating point operations in each iteration, number of communications, …), stating which data have to be monitored. This information is accessed by the monitor through the Configuration ports and is combined with the information acquired directly from the middleware implemented in the framework (e.g. number of processors to be used, kind of network, …) and/or from the performance database, in order to define the Performance Contract. Through the same ports, the components also provide the monitor with the run-time values of the data to be monitored, so that the monitoring process can be carried out. In this way the monitor can be based on the general-purpose, application-independent template depicted in Figure 3. An analogous approach can be used for the Performance Contract developer.

▪ Acquire from the components the data to be monitored through the configuration interfaces

▪ Acquire from the middleware the features of the resources to be used

▪ Acquire from the Performance Contract the values to be monitored

▪ Define the time step at which run-time data are obtained from the components

▪ For each time step

    • Get the run-time values of the data to be monitored through the configuration interfaces

    • Compare the values with the Performance Contract

    • If a violation occurs, apply the violation policies

▪ endfor

Fig. 3. Template for a general-purpose monitor for grid applications
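The loop of the template in Figure 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `read_values` stands in for the configuration interfaces, `on_violation` for the violation policies, the contract is reduced to one maximum value per monitored quantity, and the `tolerance` parameter and the sample values are hypothetical. The acquisition of the resource features from the middleware is omitted.

```python
import time
from typing import Callable, Dict


def monitor(
    read_values: Callable[[], Dict[str, float]],
    contract: Dict[str, float],
    on_violation: Callable[[str, float, float], None],
    steps: int,
    step_time: float = 0.0,
    tolerance: float = 0.1,
) -> int:
    """Poll run-time values and compare them with the contract.

    read_values  -- stands in for the configuration interfaces
    contract     -- maximum allowed value for each monitored quantity
    on_violation -- stands in for the violation policies
    Returns the number of violations detected.
    """
    violations = 0
    for _ in range(steps):
        values = read_values()
        for name, limit in contract.items():
            measured = values[name]
            # A value above the contracted one (plus slack) is a violation.
            if measured > limit * (1.0 + tolerance):
                violations += 1
                on_violation(name, measured, limit)
        time.sleep(step_time)
    return violations


# Usage with illustrative values only (not measurements from this paper).
samples = iter([3.9, 4.0, 7.5])
log = []
count = monitor(
    read_values=lambda: {"seconds_per_iteration": next(samples)},
    contract={"seconds_per_iteration": 3.9},
    on_violation=lambda name, measured, limit: log.append((name, measured)),
    steps=3,
)
```

Passing the reading and the reaction as functions keeps the loop independent of the application, which is the point of the template: only the two callbacks need to know about ports and policies.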

4. Computational experiments

Our experiments were carried out on a preliminary version (ASSIST-CL 1.2) of the Xxxx.xx environment [26]. The ASSIST programming model is based on a combination of the concepts of structured parallel programming and component-based programming. An ASSIST program is structured as a graph, where the nodes are the components and the edges are the abstract component interfaces, defined in terms of typed I/O streams. The basic unit of an ASSIST program is a component named parmod (parallel module), which can represent different forms of parallel computation. The user interface of the ASSIST environment is a coordination language named ASSIST-cl.

A layered software architecture has been implemented to support the above programming model on the target hardware architectures, including SMPs, MPPs and NOWs. ASSIST-cl code is compiled and then loaded and run on the target architecture by a Coordination Language Abstract Machine (CLAM). The CLAM is decoupled from the target hardware by a run-time support named Hardware Abstraction Layer Interface (HALI), which currently exports functionalities from the underlying software layers. The ASSIST compiler translates ASSIST-cl source code into C++/HALI processes and threads, using predefined implementation templates. In running this code, the CLAM uses all the facilities provided by HALI, making no assumptions about the existence of other software running on the same nodes and competing for the same resources. Finally, the ACE library supplies standard routines to exchange data between processing elements with different architectures [22]. This is the software layer that will be replaced by the Globus middleware in the Xxxx.xx programming environment.

To monitor the contract, we used the Autopilot library. This is a software environment for the adaptive run-time control of geographically distributed applications. The package consists of software items, called sensors, that communicate data from programs running on parallel and/or distributed environments to a process monitor. Furthermore, Autopilot is also able to modify the values of the variables of the running application by means of so-called actuators. We chose this tool because the presence of the actuators is fundamental, for example, in the rescheduling step that follows a violation of the performance contract. It is also important to note that the Autopilot library does not introduce significant overhead in the software environment [21] and that it uses the same Globus middleware that will be used in the Xxxx.xx environment. Figure 4 shows the software architecture used for our experiments. Note the role of the Autopilot library, which is used to implement the monitor process, in accordance with Figure 2.

[Figure 4: layer diagram with the blocks grid resources, ACE, Globus, HALI, CLAM, Autopilot, ASSIST-cl application and framework.]

Fig. 4. Software architecture of ASSIST with the Autopilot library

The computational kernel used for our tests is based on the Conjugate Gradient (CG) algorithm, implemented in a routine of the parallel library Meditomo, developed for the medical imaging application MediGrid [4,5]. The application reconstructs 64 independent two-dimensional images, using 10 iterations of CG for every image, for a total of 640 iterations. The matrices involved are sparse and unstructured, of order n =103. For this problem we developed a parmod for the reconstruction of the 64 images, where the workload is distributed dynamically among the processors by using a farm, a parallel construct available in ASSIST. With this construct, each of the 64 images appearing on the input stream of the parmod is processed, independently of the others, by the first free processor.

As previously mentioned, and following a well-established practice, defining a contract requires some knowledge of the past, in the form of a historical database containing information on the performance of previous executions.

Table 1. Execution times (in seconds) for the reconstruction of the 64 images

                                      P=1     P=2     P=4     P=8     P=12
execution time without I/O (TP)      2494    1261     629     313      234

Table 1 shows a very simple example of a record of such a database, reporting the execution time $T_P$, in seconds on P processors, of the computational kernel on a dedicated Linux Beowulf cluster with 12 Pentium 2 processors running at 550 MHz, connected by a Fast Ethernet switch at 100 Mbit/sec. We emphasize that these times do not include the I/O phases before and after each conjugate gradient. These values nevertheless confirm the natural parallelism of our problem, because we found that $T_1 \approx P \cdot T_P$.
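The relation between the one-processor time and the P-processor times can be checked directly on the values of Table 1. The short Python fragment below is a sketch added for illustration: it computes the product P times TP and the parallel efficiency, defined here (as an assumption of this example) as the one-processor time divided by that product.

```python
# Execution times T_P in seconds, taken from Table 1.
times = {1: 2494, 2: 1261, 4: 629, 8: 313, 12: 234}

t1 = times[1]

# Parallel efficiency T_1 / (P * T_P); a value of 1 means ideal scaling.
efficiency = {p: t1 / (p * tp) for p, tp in times.items()}

for p, tp in times.items():
    print(f"P={p:2d}  P*T_P={p * tp:5d}  efficiency={efficiency[p]:4.2f}")
```

On these data the efficiency stays close to 1 up to P=8 and is about 0.89 for P=12, so the relation holds only approximately on the largest configuration.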

On the basis of these data, and referring to the definition of Performance Contract given in Section 2, we can state:

• given P processors (the computational resources)

• able to execute the given application in 2494/P secs (the capability of the resources)

• and an application based on 640 iterations of the Conjugate Gradient

the performance contract establishes that

• one iteration of the conjugate gradient has to be executed in 2494/640 = 3.9 seconds, independently of the number of processors P.
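The reason why the contracted value does not depend on P can be made explicit with a small calculation. The sketch below assumes an ideal farm in which each of the P processors takes 2494/P seconds for its share of 640/P iterations; this even split is an assumption of the example (64 images do not divide evenly among 12 processors), and the function name is hypothetical.

```python
T1 = 2494.0          # seconds on one processor (Table 1)
ITERATIONS = 640     # 64 images x 10 CG iterations


def contract_seconds_per_iteration(p: int) -> float:
    """Per-iteration time implied by a run on p processors.

    Each processor is assumed to take T1/p seconds for its share of
    ITERATIONS/p iterations (ideal farm, an assumption of this sketch).
    """
    time_per_processor = T1 / p
    iterations_per_processor = ITERATIONS / p
    return time_per_processor / iterations_per_processor


values = {p: round(contract_seconds_per_iteration(p), 1) for p in (1, 2, 4, 8, 12)}
```

The factor P cancels in the ratio, so every entry of `values` equals the contracted 3.9 seconds; this is what allows the monitor to test each node against the same threshold.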

The monitored data are those defined in the Performance Contract, that is, the execution time (Wall Clock Time) of one iteration of the conjugate gradient. By integrating the Autopilot sensors in the routine, we were able to carry out a set of experiments on the Beowulf machine in which the monitor read, at run time and every 40 seconds, the Wall Clock Time of one iteration of the conjugate gradient. After 150 seconds, one of the four processors (Node 1) was overloaded by a process, external to the application, that kept the CPU busy for approximately 120 seconds before ending. This overload is meant to simulate the dynamic nature of the computational environment, in order to check whether the Performance Contract System is able to detect the violation of the contract. The code of all software modules written for these experiments is available in Appendices A through E.

Figure 5 reports the results of our test: the x-axis shows the elapsed time and the y-axis the execution time of one iteration of the Conjugate Gradient, as measured by the monitor on 4 processors. The value of the Performance Contract (PC) is also reported. It can be seen that, when the nodes of the Beowulf cluster are not overloaded with other applications, the monitored values of the execution time agree with those stated in the Performance Contract. Moreover, when Node 1 is overloaded with other applications, the value of the execution time received from the monitor is very different from that stated in the Performance Contract. These first experiments confirm that our Performance Contract System is able to define a realistic contract, which predicts the performance of the application in normal conditions, and also to detect violations of the contract itself. These results are encouraging for future developments of the Performance Contract System such as, for example, the definition and implementation of suitable strategies to handle violations of the contract.

[Figure 5: plot of the time (secs) for 1 iteration of the CG (y-axis, from 0 to 9) against the elapsed time in seconds (x-axis, from 20 to 340) for node1, node2, node3 and node4, with the contract violation marked.]

Fig. 5. Results of the monitoring process

References

1. X. Xxxx, X. Xxxxxx, X. Xxxx, X. Xxxxxxxx - Specifying and Monitoring GRADS contracts - available at the URL xxxx://xxxxxxxxx.xx.xxxx.xxx/xxxxx/xxxxxxxxxxxx/xxxx0000.xxx

2. X. Xxxxxx, To Xxxxx, X. Xxxxxx, X. Xxxxxxxx, I. Xxxxxx, X. Xxxxxx, X. Xxxxxxx, X. Xxxxxxx, X. Xxxxxxxxx, X. Xxxxxx-Xxxxxxx, X. Xxxx, X. Xxxxxxx, X. Xxxxxx - The Grads Project: Software support for High Performance Grid Applications - Int. Journal on High Performance Applications. Vol 15 (2001), pp. 327-344.

3. X. Xxxxxxxxxx, X. Xxxx - PCL - The Performance Counter Library: a Common Interface to Access Hardware Performance Counters on Microprocessors (Version 2,2) - TR available at the URL: xxxx://xxx.xx-xxxxxxx.xx/xxx/XXX/xxx/xxx/xxx.xxxx

4. X. Xxxxxxx,X.Xxxxxxx, X.Xxxxxxxxxxxx, X.X’Xxxxx, X.Xxxxxxxxx, X.Xxxxxxxxxx, X.Xxxxxxxx, X.Xxxxx, and X.Xxxxx – A Grid-Based RPC System for Medical Imaging - Parallel and Distributed Scientific and Engineering Computing: Practice and Experiences, Advances in Computation: Theory and Practice, vol. 15, X.Xxx and X.Xxxx (eds.), 2004, pp. 189-204.

5. X. Xxxxxxxx, X. Xxxxxxx, X. Xxxxxxxxxxxx, L. D’Amore, X. Xxxxx - Parallel Software for 3D SPECT imaging based on the 2D + 1 approximation of collimator blur - Ann. Univ. Ferrara, sez. VII, Sci. Mat. Vol. XLV, 2000

6. X. Xxxxxx, X. Xxxxx, X. X have, X. Xxxxx, - PAPI: a Portable Interface to Hardware Performance Counters - Proceedings of Department of Defense HPCMP Users Group Conference, 1999

7. CCA Forum Home page. xxxx://xxx.xxx-xxxxx.xxx

8. X. Xxxxxx et al. – New Grid Scheduling and Rescheduling Methods in GraDS Project – available at URL xxxx://xxxxxxxx.xxx.xxx.xxx/000000.xxxx

9. I. Xxxxxx - What is the Grid? A three point checklist - available at URL xxxx://xxx- xx.xxx.xxx.xxx/xxxxxxx/Xxxxxxx/XxxxXxXxxXxxx.xxx

10. I. Xxxxxx , X.Xxxxxxxxx - The Grid: Blueprint for a New Computing Infrastructure - Xxxxxx and Xxxxxxx 1998

11. I. Xxxxxx , X.Xxxxxxxxx - Globus: a metacomputing infrastructure toolkit - Int. Journal on Supercomputing Application, vol. 11 (1997), pp. 115-128

12. I. Xxxxxx, X. Xxxxxxxxx, X. Xxxx, X. Xxxxxx - The Physiology of the Grid: an Open Grid Services Architecture for Distributed Systems Integration. Global Grid Forum, 2002

13. X. Xxxxxxxx, X. Xxx, A. Xxxxx, X. Xxxxxxxx, Xxxx Xxxxxxxxxx - ICENI: An Open Grid Service Architecture Implemented with Jini – Supercomputing 2002

14. X. Xxxxxxxxxxx, X. Xxxxxxxx, X. Xxxx, X. Xxxxxxxxx, X. Xxxxxx, and X. Xxxxxxx : XCAT 2.0: A Component-Based Programming Model for Grid Web Services.. Technical Report-TR562, Department of Computer Science, Indiana University. Jun 2002.

15. X. Xxxxxxx, X. Xxxxxxxxx - MARS, framework for minimizing the job execution time in a metacomputing environment- Future Generation Computer Systems, vol. 12 (1996), pp. 87-99

16. X. Xxxxxxx, X. Xxxxxx, X. Xxxxxxx - Predictive Application Modeling Performance in a Computational Grid Environment - Eighth IEEE Int. Symp. On High Performance Distributed Computing (1999), pp. 47-54

17. C. Xx, X. Xxxx - Compact Application Signature for Parallel and Distributed Scientific Codes - Proc. of Supercomputing 2002, (SC2002), Baltimore

18. X. Xxxxxx, X. Xxxx - Monitoring Large Systems via Statistical Sampling - Proc. LACSI Symposium, Santa Fe, 2002

19. X. Xxxxxx, X. Xxxxxxxxx, X. Xxxxxxxx, X. Xxxxxxxxxxxxx, X. Xxxxx Xxxxx, X. Karavanic, K. Xxxxxxxxxxxxxx, X. Xxxxxxx - The Paradyn Parallel Performance Measurement Tools - IEEE Computer vol. 28 (1995) pp. 37-46

20. X. Xxxxxxx, X. Xxxxxxxxx, X. Xxxxxxxx, X. Xxxxx, X. Xxxx, X. Xxxxx, X. Xxxxxxxx - Numerical Libraries and the Grid: The GrADS Experiment with ScaLAPACK, - Technical report UT-CS-01-460, 2001

21. X. Xxxxxx, X. Xxxxxx, X. Xxxxxxx, X. Xxxx - Autopilot: Adaptive Control of Distributed Applications - Proc. of High Performance Distributed Computing Conference, 1998, pp. 172-179

22. X.X.Xxxxxxx, X. Xxxxxxxx, E. Al-Xxxxx – Object Oriented components for high speed network programming – in proc. of 1st conf. on OO technology and systems (1995)

23. X. Xxxxx, I Xxxxxx X. Xxxxxx. - Predicting application run times using historical information. - Proc. Of the IPPS/SPDP' 98 workshop on job scheduling strategies for parallel processing (1998)

24. X. Xxxxxxx and X. Xxxxxxxx - A performance oriented migration framework for the grid - Proceedings of the 3rd International Symposium on Cluster Computing and the Grid, 2003

25. X. Xxxxxxxxx – High Performance Grid Programming Environments: The Xxxx.xx ASSIST Approach , invited lecture, ICCSA 2004.

26. X. Xxxxxxxxx – The programming model of ASSIST, an environment for parallel and distributed portable applications – Parallel Computing, vol. 28 (2000), pp. 1709-1731

27. X.Xxxxxxxx, X. Xxxx, X. Xxxxxx, X. Xxxx – Performance contracts: predicting and monitoring application behaviour – Proc. IEEE/ACM Second Intern. Workshop on Grid Computing, Denver, 2001, Springer Verlag LNCS, vol. 2242, pp. 154-165

28. J. C. Xxx, X. Xxxxxxx and X. Xxxxxxxxx. "The Automated Instrumentation and Monitoring System (AIMS) -- Version 3.2 Users' Guide". NAS Technical Report. NAS-97- 001, January 1999

Appendix A

ASSIST-cl code that describes the application as a graph of 3 components (two sequential modules and one parmod)

#define MAX_ITER 63

/****************** main with the ASSIST graph ******************/
generic main() {
    stream long str_A2B;
    stream double str_B2C;

    genera (output_stream str_A2B);
    elabora (input_stream str_A2B output_stream str_B2C);
    stampa (input_stream str_B2C);
}

/************ module generating 64 integers **************/
genera (output_stream long str_A2B) {
    Fgenera (output_stream str_A2B);
}

proc Fgenera (output_stream long str_A2B)
$f77{
      DO I=0,MAX_ITER
        assist_out (str_A2B, i);
        print*,'generato',i
      ENDDO
}f77$

/********** module printing the times ********************/
stampa (input_stream double str_B2C) {
    Fstampa (in str_B2C);
}

proc Fstampa (in double str_B2C )
$c++{
    printf("stampa : %f\n",str_B2C);
}c++$

/************ parmod for the conjugate gradients ********************/
parmod elabora (input_stream long str_A2B output_stream double str_B2C) {
    topology none Pv;

    do input_section {
        guard1: on , , str_A2B {
            distribution str_A2B on_demand to Pv;
        }
    } while (true)

    virtual_processors {
        elab1 (in guard1 out str_B2C) {
            VP {
                Felab (in str_A2B output_stream str_B2C);
            }
        }
    }

    output_section {
        collects str_B2C from ANY Pv;
    }
}

proc Felab (in long str_A2B output_stream double str_B2C)
$f77{
      DOUBLE PRECISION hT1,hT2
      integer strA
      strA=str_A2B
      CALL gradcon(strA,hT1,hT2)
      assist_out(str_B2C,hT1);
      assist_out(str_B2C,hT2);
}f77$

Appendix B

Listing of the f77 code that performs the 2D conjugate gradient reconstruction. The listing contains the Autopilot instrumentation, obtained by means of the wrappers reported in Appendix D

      SUBROUTINE gradcon(str,T1,T2)
c start of code
C 2D conjugate gradient reconstruction program
c reads the input from fce.inp and from a sinogram specified in fce.inp
c23456789012345678901234567890123456789012345678901234567890123456789012
c CGMM iterations, imitation and simplification of RECLBL
c PARAMETER(NANGP=120,NDIMUP=128,KDIMUP=128,NBIN=27,NPROF=20,
c &NLEV=20,NANKDI=NANGP*KDIMUP,NDINDI=NDIMUP*NDIMUP,ITOT= 000,
x &XXx0.00*XXXXXXxXXXXx0,X0Xx-00,X0Xx00,XXXx00,XXXX0xXXXXx0,XXXXx00)
xxxxxxx xxxxx, xxxx ,xxx
      include '/users/home4/laccettiwp8/LAPEGNA/form.var'
      DIMENSION xx(NDINDI)
      CHARACTER*50 acc,OUTFILE
      DOUBLE PRECISION T,SECONDI,TCAL,T1,T2
      COMMON /SINOGRAMMI/ ISIN
      DATA OUTFILE/'xx.bin. '/
      TCAL=0.0D0
      insin = str
      ISIN=INSIN
c read the input
      CALL stac(acc,llc,itr,iwgt,th,niter,irat)
      IF (irat.LT.1) STOP ' valore illegale di rate'
      WRITE (6,*) ' Preparo le iterazioni...'
c miscellaneous initializations
      CALL setitc(acc,llc,itr,iwgt,th)
      WRITE (6,*) ' ...finito!'
      WRITE (6,*) ' Eseguiro''',niter,' iterazioni.'
c run the CG iterations; T1 and T2 are the wall-clock times
c taken just before and just after the call
      T = secondi()
      T1=T
      CALL cgcec(xx,niter,irat)
c     T = secondi() - T
      T = secondi()
      T2=T
c     assist_out(str_B2C,T);
      TCAL = TCAL + T
      IF (ISIN.LT.10) THEN
         WRITE(OUTFILE(8:8),'(I1)') ISIN
         WRITE (OUTFILE(9:10),'(2A)') '.P'
         WRITE (OUTFILE(11:12),'(I2.2)') NPROCS
         INFILEL=12
      ELSE
         WRITE(OUTFILE(8:9),'(I2)') ISIN
         WRITE (OUTFILE(10:11),'(2A)') '.P'
         WRITE (OUTFILE(12:13),'(I2.2)') NPROCS
         INFILEL=13
      END IF
      WRITE (6,*) ' Uscita finale in ',OUTFILE(1:INFILEL)
      CALL wwrrr4(xx,NDINDI,OUTFILE(1:INFILEL))
      jsin =insin
      print*,'elaborato ',insin
      RETURN
      END

      include '/users/home4/laccettiwp8/LAPEGNA/reawri.txt'
      include '/users/home4/laccettiwp8/LAPEGNA/len_trum.txt'

c letture di dati ed alcune inizializzazioni SUBROUTINE stac(acc,llc,itr,iwgt,th,niter,irat)

include '/users/home4/laccettiwp8/LAPEGNA/form.var' DIMENSION pp(NANKDI)

COMMON /aaa/sine(NANGP),cosine(NANGP),rfac(NLEV1*ITOT),

+ IDX(NPROF*2)

COMMON /bbb/pixinv,NGH,RNLEV,HNLEV,NLEVN,HPROF,zn,axis,kmov COMMON /ccc/IL1(NDIMUP,3),ROOT(LPZ),INDX(L1X:L2X),IMIN,IMIN1,IMAX COMMON /wrk/zaz(NANKDI+4*NDINDI)

COMMON /SINOGRAMMI/ ISIN CHARACTER*60 acs,acc,INFILE EQUIVALENCE (zaz,pp)

DATA INFILE/'/users/home4/laccettiwp8/LAPEGNA/RUN/pro_dvc. '/ OPEN (37,file='/users/home4/laccettiwp8/LAPEGNA/RUN/fce.inp'

* ,status='old') READ (37,*)

READ (37,'(A)') acs

lls = len_trum(acs) IF (ISIN.LT.10) THEN

WRITE(INFILE(46:46),'(I1)') ISIN INFILEL=46

ELSE

WRITE(INFILE(46:47),'(I2)') ISIN INFILEL=47

END IF

c IF (INFILE(9:9).EQ.' ') INFILE(9:9) = '0'

c WRITE (6,'(A)') ' sinogramma '//INFILE(1:50) print*, ' sinogramma ', INFILE(1:50)

READ (37,*)

READ (37,*) axisu,rhu,rh

WRITE (6,*) ' axisu =',axisu,' raggio collimatore =',rh WRITE (6,*) ' raggio ricostruzione =',rhu

IF (RHU+3.GT.RH) STOP ' abbassa raggio ricostruzione' READ (37,*)

READ (37,'(A)') acc

llc = len_trum(acc)

c WRITE (6,'(A)') ' file collimatore '//acc(1:llc) print*, ' file collimatore ' , acc(1:llc)

READ (37,*)

READ (37,*) itr,niter,irat IF (itr.EQ.1) THEN

WRITE (6,*) ' con precondizionamento'

ELSE

WRITE (6,*) ' senza precondizionamento'

END IF

READ (37,*)

READ (37,*) iwgt

IF (iwgt.EQ.1) THEN

WRITE (6,*) ' con pesi statistici'

ELSE

WRITE (6,*) ' senza pesi statistici' GO TO 3

END IF

READ (37,*) READ (37,*) th

WRITE (6,*) ' soglia minima sinogramma =',th

IF (th.LT.1.0e-10) STOP ' valore illegale di soglia'

3 CLOSE (37)

c qui legge le proiezioni

CALL rreer4(pp,XXXXXX,INFILE(1:INFILEL)) WRITE (6,'(A)') ' letto '//INFILE(1:INFILEL) NGH = NBIN/2

IF (2*NGH.EQ.NBIN) STOP ' NBIN deve essere dispari!' radius = sqrt(.5)*float(NDIMUP)

val = radius + 3. - axisu + float(NGH) kmov = val

IF (kmov.LT.1) STOP

+ ' non ammesso kmov < 1 : axisu troppo grande !' c WRITE(6,*)AXISU,RADIUS,VAL,kmov

axis = axisu + float(kmov) + .5 WRITE (6,*) ' axis =',axis

val = axis + radius + float(NGH) + 3. ival = val

c write(6,*)ival,NK

IF (NK.LT.ival) STOP ' aumenta NK : axisu troppo piccolo!' IF (NK.LT.KDIMUP+kmov) STOP ' aumenta NK !'

X XXXXXx00

X XXXXx00

XXXXX = FLOAT(NLEV) HNLEV = .5*RNLEV NLEVN = NLEV*NGH

cprj HPROF=NPROF*.5+.5 HPROF = NPROF*.5 + 1.

zn = 0.5*FLOAT(NDIMUP+1)

pig2 = 8*atan(1.0)

DO i = 1,NANGP

ang = ((i-1)*pig2)/NANGP sine(i) = sin(ang) cosine(i) = cos(ang)

END DO

c qui legge N52N.UNF

CALL legcln(acc(1:llc),pixinv) pixperpas = 1./pixinv

c qui determina i limiti per il disco di ricostruzione di raggio c RHU

c RHU = raggio disco di rocostruzione nell'immagine, RH e' il c raggio di

c curvatura del rivelatore DO I = 1,NDIMUP

IL1(I,1) = NDIMUP IL1(I,2) = 1

IL1(I,3) = 1 END DO

RH2 = RH*RH RHU2 = RHU*RHU INSD = 0

      DO I = 1,NDIMUP
         DX = I - zn
         DX2 = DX*DX
         DO J = 1,NDIMUP
            DY = J - zn
            DY2 = DY*DY
            IF (DX2+DY2.LT.RHU2) THEN
               IF (INSD.EQ.0) THEN
                  IL1(I,1) = J
                  INSD = 1
               ELSE
                  IL1(I,2) = J
               END IF
            ELSE
               IF (INSD.EQ.1) GO TO 57
            END IF
         END DO
   57    INSD = 0
      END DO

DO I = 1,NDIMUP

IF (IL1(I,1).LE.IL1(I,2)) GO TO 58 END DO

STOP ' immagine vuota! non puo'' essere!'

58 IMIN = I

DO I = IMIN,NDIMUP

IF (IL1(I,1).LE.IL1(I,2)) IMAX = I END DO

WRITE (6,*) ' indici principali =',IMIN,IMAX IMIN1 = IMIN - 1

DO I = 1,NDIMUP

IL1(I,3) = IL1(I,1) - 1 END DO

IRH = RH

IRH1 = IRH + 1

WRITE (6,*) ' riempimento ROOT => ',IRH1 IF (IRH1.GT.LPZ) STOP ' aumenta LPZ'

DO I2 = 0,IRH

PZZ = I2

PZZ2 = PZZ*PZZ

ROOT(I2+1) = SQRT(RH2-PZZ2) - RH END DO

DO I = L1X,0

INDX(I) = 1 END DO

DO I = 1,NPROF INDX(I) = I

END DO

DO I = NPROF + 1,L2X INDX(I) = NPROF

END DO RETURN END

c self-explanatory: sets the first ii elements of vv to zero
      SUBROUTINE zero(vv,ii)
      DIMENSION vv(ii)
      DO i = 1,ii
         vv(i) = 0.
      END DO
      RETURN
      END

c settaggi iniziali e generazione del precondizionatore

SUBROUTINE setitc(acc,llc,itr,iwgt,th)

include '/users/home4/laccettiwp8/LAPEGNA/form.var'

DIMENSION pp(KDIMUP,NANGP),bb(NDINDI),bu(NDINDI),co(KDIMUP,NANGP) CHARACTER*50 acc

COMMON /aaa/sine(NANGP),cosine(NANGP),rfac(NLEV1*ITOT),

+ IDX(NPROF*2)

COMMON /wrk/zaz(NANKDI+4*NDINDI)

c gestione allegra del common wrk - ma tutto sembra funzionare !

EQUIVALENCE (zaz,pp), (zaz(NANKDI+1),bb),

+ (zaz(NANKDI+NDINDI+1),bu), (zaz(NANKDI+2*NDINDI+1),co)

IF (iwgt.EQ.1) THEN

c -----implementazione nuova SWT 3D c inizio con le proiezioni p in pp

DO k = 1,NANGP

psum = 0.

DO i = 1,KDIMUP

IF (pp(i,k).LT.0.) pp(i,k) = 0.

psum = psum + pp(i,k) END DO

psum = psum/KDIMUP

c write(6,*)k,psum theff = th

DO i = 1,KDIMUP

psav = pp(i,k)

pp(i,k) = 1./ (theff+psav) co(i,k) = psav*pp(i,k)

END DO END DO

c alla fine di questi do mi trovo in pp i termini diagonali c di C^(-1) ed in

c co il risultato di C^(-1) p (vedi Appendix) c -----fine implementazione nuova SWT 3D

c skip this, please !

ELSE

DO k = 1,NANGP

DO i = 1,KDIMUP

co(i,k) = pp(i,k) pp(i,k) = 1.

END DO END DO

END IF

c in bb metto F^(*) C^(-1) p CALL bckproc(bb,co)

IF (itr.EQ.1) THEN

DO i = 1,NLEV1*ITOT

rfac(i) = rfac(i)*rfac(i) END DO

c qui calcolo D^2, quadrato della matrice di precondizionamento, c sulla base

c della (3.11) che va a finire in bu CALL bckproc(bu,pp)

c rileggo i dati da N52N.UNF in quanto li ho dovuti elevare al c quadrato nel

c passo precedente; non preoccupatevi troppo di questo - sono c tecnicismi

CALL legcln(acc(1:llc),pixinv)

ELSE

c skip this, please !

DO i = 1,NDINDI

bu(i) = 1.

END DO END IF

c qui calcolo D^(-1) che lascio in bu; nell'algoritmo e' D^(-1) c che mi serve

c (vedi Appendix)

DO i = 1,NDINDI

IF (bu(i).EQ.0.) THEN GO TO 99

ELSE

bu(i) = 1./sqrt(bu(i))

END IF

bb(i) = bb(i)*bu(i)

99 END DO

c adesso in bb c'e' quello che in Appendix e' B (grassetto) RETURN

END

c ff = retroproiezione di pp ; ff = F^(*) pp c forse per adesso basta capire solo questo

SUBROUTINE bckproc(ff,pp)

include '/users/home4/laccettiwp8/LAPEGNA/form.var' DIMENSION ff(NDIMUP,NDIMUP),pp(NANKDI)

DIMENSION ps(NK)

COMMON /aaa/sine(NANGP),cosine(NANGP),rfac(NLEV1,ITOT),

+ IDX(NPROF,2)

COMMON /bbb/pixinv,NGH,RNLEV,HNLEV,NLEVN,HPROF,zn,axis,kmov COMMON /ccc/IL1(NDIMUP,3),ROOT(LPZ),INDX(L1X:L2X),IMIN,IMIN1,IMAX

CALL zero(ff,NDINDI) CALL zero(ps,NK)

mi = 0

DO m = 1,NANGP

DO i = 1,KDIMUP

ps(i+kmov) = pp(mi+i) END DO

c write(6,*)-m

zz = zn* (sine(m)-cosine(m)) + axis + IMIN1*cosine(m) cprj xx=zn*(sine(m)+cosine(m))+0.5-IMIN1*sine(m)

xx = zn* (sine(m)+cosine(m)) - IMIN1*sine(m)

DO j = IMIN,IMAX

zz = zz + cosine(m) xx = xx - sine(m)

pz = zz - (IL1(J,3))*sine(m)

px = xx - (IL1(J,3))*cosine(m)

DO i = IL1(J,1),IL1(J,2)

pz = pz - sine(m) px = px - cosine(m)

c pz=((2*j-NDIMUP-1)*cosine(m)-(2*i-NDIMUP-1)*sine(m))/2+axis c px e' -x e cresce con la distanza del pixel dal rivelatore

c px=(-(2*j-NDIMUP-1)*sine(m)-(2*i-NDIMUP-1)*cosine(m))/2 k = pz

PZZ = ABS(PZ-AXIS) J1 = PZZ

I1 = J1 + 1 I2 = I1 + 1

VAL = (I1-PZZ)*ROOT(I1) + (PZZ-J1)*ROOT(I2)

MK = (px+VAL)*pixinv + HPROF MP = INDX(MK)

IND1 = 1. + RNLEV* (pz-FLOAT(k))

C IF(IND1.EQ.NLEV1)IND1=NLEV

IC1 = IDX(MP,2)

DO KVV = K - IDX(MP,1),K + IDX(MP,1)

ff(i,j) = ff(i,j) + RFAC(IND1,IC1)*PS(KVV) IC1 = IC1 + 1

END DO END DO

END DO

mi = mi + KDIMUP END DO

RETURN END

c CG subroutine; see the APPENDIX of the .ps
      SUBROUTINE cgcec(x,itmax,irat)
      DOUBLE PRECISION T,SECONDI, timeiter, timestart, dsin
      include '/users/home4/laccettiwp8/LAPEGNA/form.var'
      DIMENSION x(NDINDI)
      DIMENSION wgt(NANKDI),r(NDINDI),tr(NDINDI),s(NDINDI),z(NDINDI)
c wgt => C^(-1); r => R; tr => D^(-1); s => S; z => Z (see Appendix)
c the names differ from those used in setitc, but match those of the
c Appendix
      COMMON /wrk/zaz(NANKDI+4*NDINDI)
      COMMON /SINOGRAMMI/ ISIN
      EQUIVALENCE (zaz,wgt), (zaz(NANKDI+1),r),
     + (zaz(NANKDI+NDINDI+1),tr), (zaz(NANKDI+2*NDINDI+1),s),
     + (zaz(NANKDI+3*NDINDI+1),z)
c registration of the Autopilot sensors
      timeiter =0.
      dummy = ap_setup()
      dsin = isin
      dummy = ap_regsensors(timeiter,dsin)
      iter = 0
      CALL zero(x,NDINDI)
c compute S_0 = D^(-1) B
      DO j = 1,NDINDI
         s(j) = r(j)*tr(j)
      END DO
c compute the numerator of (A.3) and the denominator of (A.6)
      bkden = sclpr1(r,r,NDINDI)
   10 iter = iter + 1
      timestart = secondi()
      WRITE (6,*) iter
c sandwich: projection - multiplication by C^(-1) - backprojection,
c i.e. step (A.2)
      T = secondi()
      CALL prbcer(s,z,wgt)
      T = secondi() -T
!     write(*,*)' Tempo di prbcer = ',T
c compute the denominator of (A.3)
      akden = sclpr1(s,z,NDINDI)
      ak = bkden/akden
c steps (A.4) and (A.5)
      DO j = 1,NDINDI
         x(j) = x(j) + ak*s(j)
         r(j) = r(j) - ak*tr(j)*z(j)
      END DO
      ieq = iter/irat
c dump the intermediate iterates to file
!     IF (iter.EQ.irat*ieq) CALL FUSER(X,ITER)
c exit when iter reaches the maximum value
      IF (iter.GE.itmax) goto 789
c compute the numerator of (A.6)
      bknum = sclpr1(r,r,NDINDI)
c compute (A.6)
      bk = bknum/bkden
c step (A.7)
      DO j = 1,NDINDI
         s(j) = tr(j)*r(j) + bk*s(j)
      END DO
      bkden = bknum
c timeiter is the variable exported through the Autopilot sensor
      timeiter = secondi() - timestart
      print*,'--->>', timeiter
      GO TO 10
  789 continue
      dummy = ap_unregsensors()
      END

c scalar product of a and b
      FUNCTION sclpr1(a,b,nx)
      DIMENSION a(nx),b(nx)
      sclpr1 = 0.
      DO i = 1,nx
         sclpr1 = sclpr1 + a(i)*b(i)
      END DO
      RETURN
      END

SUBROUTINE prbcer(ff,oo,wgt) DOUBLE PRECISION T,SECONDI

c calcola oo=F(*) wgt F ff ove wgt e' C^(-1); realizza il passo (A.2) c dell'Appendix

include '/users/home4/laccettiwp8/LAPEGNA/form.var' DIMENSION ff(NDIMUP,NDIMUP),oo(NDIMUP,NDIMUP),wgt(NANKDI) DIMENSION ps(NK),pw(NK)

COMMON /aaa/sine(NANGP),cosine(NANGP),rfac(NLEV1,ITOT),

+ IDX(NPROF,2)

COMMON /bbb/pixinv,NGH,RNLEV,HNLEV,NLEVN,HPROF,zn,axis,kmov COMMON /ccc/IL1(NDIMUP,3),ROOT(LPZ),INDX(L1X:L2X),IMIN,IMIN1,IMAX

CALL zero(oo,NDINDI) CALL zero(pw,NK)

mi = 0

DO m = 1,NANGP

CALL zero(ps,NK)

DO i = 1,KDIMUP

pw(i+kmov) = wgt(mi+i) END DO

c write(6,*)m

zst = zn* (sine(m)-cosine(m)) + axis + IMIN1*cosine(m) cprj xst=zn*(sine(m)+cosine(m))+0.5-IMIN1*sine(m)

xst = zn* (sine(m)+cosine(m)) - IMIN1*sine(m) zz = zst

xx = xst

T = secondi()

DO j = IMIN,IMAX

zz = zz + cosine(m) xx = xx - sine(m)

pz = zz - (IL1(J,3))*sine(m)

px = xx - (IL1(J,3))*cosine(m)

DO i = IL1(J,1),IL1(J,2)

pz = pz - sine(m) px = px - cosine(m) k = pz

PZZ = ABS(PZ-AXIS) J1 = PZZ

I1 = J1 + 1 I2 = I1 + 1

VAL = (I1-PZZ)*ROOT(I1) + (PZZ-J1)*ROOT(I2)

MK = (px+VAL)*pixinv + HPROF MP = INDX(MK)

IND1 = 1. + RNLEV* (pz-FLOAT(k))

C IF(IND1.EQ.NLEV1)IND1=NLEV

IC1 = IDX(MP,2) LSAXPY=2*IDX(MP,1) + 1 CALL SAXPY(LSAXPY,FF(I,J),

+ RFAC(IND1,IC1),NLEV1,

+ PS(K - IDX(MP,1)),1)

! DO KVV = K - IDX(MP,1),K + IDX(MP,1)

! PS(KVV) = PS(KVV) + FF(I,J)*RFAC(IND1,IC1)

! IC1 = IC1 + 1

! END DO

END DO END DO

T = secondi() -T

! write(*,*)' Tempo primo ciclo = ',T

c qui ha finito la proiezione su ps, ora fa la retroproiezione su oo c qui si taglia a KDIMUP bins

c qui sotto moltiplica per C^(-1) rappresentato da wgt DO i = 1,NK

ps(i) = ps(i)*pw(i) END DO

zz = zst xx = xst

T = secondi()

DO j = IMIN,IMAX

zz = zz + cosine(m)

xx = xx - sine(m)

pz = zz - (IL1(J,3))*sine(m)

px = xx - (IL1(J,3))*cosine(m)

DO i = IL1(J,1),IL1(J,2)

pz = pz - sine(m) px = px - cosine(m) k = pz

PZZ = ABS(PZ-AXIS) J1 = PZZ

I1 = J1 + 1 I2 = I1 + 1

VAL = (I1-PZZ)*ROOT(I1) + (PZZ-J1)*ROOT(I2)

MK = (px+VAL)*pixinv + HPROF MP = INDX(MK)

IND1 = 1. + RNLEV* (pz-FLOAT(k))

C IF(IND1.EQ.NLEV1)IND1=NLEV

IC1 = IDX(MP,2) LSDOT=2*IDX(MP,1) + 1

oo(i,j) = oo(i,j) +

+ SDOT(LSDOT,RFAC(IND1,IC1),NLEV1,

+ PS(K - IDX(MP,1)),1)

! DO KVV = K - IDX(MP,1),K + IDX(MP,1)

! oo(i,j) = oo(i,j) + RFAC(IND1,IC1)*PS(KVV)

! IC1 = IC1 + 1

! END DO

END DO END DO

T = secondi() -T

! write(*,*)' Tempo secondo ciclo = ',T mi = mi + KDIMUP

END DO RETURN END

c legge N52N.UNF

SUBROUTINE legcln(file,pixinv)

include '/users/home4/laccettiwp8/LAPEGNA/form.var' COMMON /aaa/sine(NANGP),cosine(NANGP),rfac(NLEV1*ITOT),

+ IDX(NPROF*2)

CHARACTER file* (*)

ll = len_trum(file)

OPEN (37,file=file(1:ll),status='old',form='unformatted')

WRITE (6,*) ' leggo unformatted da ',file(1:ll)

READ (37,err=15) pixperpas pixinv = 1./pixperpas

READ (37,err=15) IDX

READ (37,err=15) rfac

READ (37,end=13,err=15) NN,NN1,NN2,NN3 CLOSE (37)

IF (nn.NE.NBIN) STOP

+ ' marchio errato alla fine del file collimatore' IF (nn1.NE.NPROF) STOP

+ ' marchio errato alla fine del file collimatore' IF (nn2.NE.NLEV) STOP

+ ' marchio errato alla fine del file collimatore' IF (nn3.NE.ITOT) STOP

+ ' marchio errato alla fine del file collimatore' RETURN

13 STOP ' manca marchio alla fine del file collimatore!'

15 STOP ' questo file collimatore non va bene!' END

c scrive le iterate intermedie SUBROUTINE FUSER(X,J)

include '/users/home4/laccettiwp8/LAPEGNA/form.var' DIMENSION X(NDINDI)

CHARACTER ACH*6 CHARACTER CH*3

DATA ACH/'xx.rbb'/

WRITE (CH,'(I3)') J

IF (CH(1:1).EQ.' ') CH(1:1) = '0'

IF (CH(2:2).EQ.' ') CH(2:2) = '0' ACH(4:6) = CH(1:3)

WRITE (6,'(A)') ' uscita in '//ACH CALL wwrrr4(X,NDINDI,ACH)

RETURN END

c end of code
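The iteration in cgcec is a conjugate gradient scheme with a diagonal scaling D^(-1), run for a fixed number of steps. The C++ sketch below mirrors the order of its updates on a small dense symmetric positive definite matrix, so that the structure can be followed without the projector code. It rests on several assumptions that are not in the listing: the operator is a dense matrix A rather than the projection and backprojection pair of prbcer, D^(-1) is taken as 1/sqrt(diag(A)), and two guards against division by zero are added. The names pcg_dense, dot and Vec are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

typedef std::vector<double> Vec;

// Scalar product, the role played by sclpr1 in the listing.
static double dot(const Vec& a, const Vec& b) {
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) sum += a[i] * b[i];
    return sum;
}

// Runs at most itmax scaled CG steps on A x = b, starting from x = 0.
// A is assumed symmetric positive definite with a positive diagonal.
Vec pcg_dense(const std::vector<Vec>& A, const Vec& b, int itmax) {
    const std::size_t n = b.size();
    Vec x(n, 0.0), r(n), tr(n), s(n), z(n);

    for (std::size_t j = 0; j < n; ++j) {
        tr[j] = 1.0 / std::sqrt(A[j][j]);  // assumed choice of D^(-1)
        r[j] = tr[j] * b[j];               // scaled right-hand side
        s[j] = r[j] * tr[j];               // S_0 = D^(-1) r
    }
    double bkden = dot(r, r);

    for (int iter = 1; iter <= itmax && bkden > 0.0; ++iter) {
        // z = A s (in the listing this is the call to prbcer)
        for (std::size_t i = 0; i < n; ++i) {
            z[i] = 0.0;
            for (std::size_t j = 0; j < n; ++j) z[i] += A[i][j] * s[j];
        }
        const double akden = dot(s, z);
        if (akden == 0.0) break;           // added guard, not in the listing
        const double ak = bkden / akden;

        for (std::size_t j = 0; j < n; ++j) {
            x[j] += ak * s[j];             // update of the solution
            r[j] -= ak * tr[j] * z[j];     // update of the scaled residual
        }
        const double bknum = dot(r, r);
        const double bk = bknum / bkden;
        for (std::size_t j = 0; j < n; ++j) {
            s[j] = tr[j] * r[j] + bk * s[j];  // new search direction
        }
        bkden = bknum;
    }
    return x;
}
```

For the 2x2 system with rows (4, 1) and (1, 3) and right-hand side (1, 2), two steps reproduce the exact solution (1/11, 7/11) up to rounding, as expected of conjugate gradients on an n-dimensional problem in exact arithmetic.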

Appendix C

The C++ code of the monitor developed with Autopilot

#include <cstdlib>
#include <iostream>
#include <unistd.h>
#include "ApGlobal.h"
#include "ApProperties.h"
#include "ApSensorClient.h"

using std::cerr;
using std::endl;

// Prototype for the client data handler
void Handler( globus_nexus_endpoint_t* endpoint,
              globus_nexus_buffer_t* buffer,
              globus_bool_t called_from_non_threaded_handler );

int main(int argc, char **argv) {
  int registred, attached;

  cerr << "monitor partito" << endl;
  if ( ApGlobal::startup( argv[0] ) != SUCCESS_ ) exit( 1 );

  // specify the sensor properties
  ApProperties propsensore( "meditomo", "beocomp");
  propsensore.addProperty( "Name1", "timesensor" );
  //propsensore.addProperty( "Name2", "isin" );
  ApGlobal::lockCerrMutex();
  cerr << "proprieta sensore definite" << endl;
  ApGlobal::unlockCerrMutex();

  // create the data structure that manages the sensors
  ApSensorClient snsrClient( propsensore, &Handler );
  registred = 0;
  while( registred == 0 ) {
    // check how many sensors are actually registered
    registred = snsrClient.requestMatchingStartpoints( ApGlobal::Nonblocking );
    if ( registred != 0 ) {
      // attach to them
      snsrClient.attachClient();
      attached = snsrClient.getNumberOfAttachedStartpoints( );
      while( attached != 0 ) {
        ApGlobal::lockCerrMutex();
        cerr << endl;
        cerr << "agganciato " << attached << " sensori su " << registred << endl;
        cerr << endl;
        ApGlobal::unlockCerrMutex();
        // read the value from the sensor
        snsrClient.recordData();
        sleep( 5 );
        registred = snsrClient.requestMatchingStartpoints( ApGlobal::Nonblocking );
        snsrClient.attachClient();
        attached = snsrClient.getNumberOfAttachedStartpoints( );
      }
      ApGlobal::lockCerrMutex();
      cerr << "sensori tutti staccati (" << attached << " su " << registred << ")" << endl;
      ApGlobal::unlockCerrMutex();
    } else {
      ApGlobal::lockCerrMutex();
      cerr << "non ci sono sensori registrati .." << endl;
      ApGlobal::unlockCerrMutex();
      sleep( 1 );
    } // end if
  } // end while on the registered sensors

  // print the closing message before exit(), which does not return
  cerr << "fine monitor" << endl;
  exit( ApGlobal::shutdown( ) );
}

Appendix D

Support routines (for timing and for the C-Fortran wrappers)

// function second.c
#include <stdio.h>
#include <sys/param.h>
#include <sys/types.h>
#include <sys/times.h>
#include <sys/time.h>

// wall-clock time in seconds; called from Fortran as secondi()
double secondi_()
{
  struct timeval clock;
  struct timezone tzone;
  double temp;

  gettimeofday (&clock,&tzone);
  temp=clock.tv_sec+clock.tv_usec*0.000001;
  return(temp);
}

// Wrappers for the Autopilot setup and for
// the registration and deregistration of a sensor

#include "stdio.h"
#include "ApGlobal.h"
#include "ApProperties.h"
#include "ApDoubleSensor.h"

extern "C" {

ApDoubleSensor *sensore;

int ap_setup_(){
  if ( ApGlobal::startup( "meditomo" ) != SUCCESS_ ) exit( 1 );
  return 999;
}

int ap_regsensors_(double * timeiter, double * dsin){
  int ndouble = 2, bufferFactor = 1;
  double * data[2];

  // specify the sensor properties
  ApProperties propsensore( "meditomo", "beocomp");
  propsensore.addProperty ( "Name1", "timesensor" );
  ApGlobal::lockCerrMutex();
  cerr << "proprieta definite" << endl;
  ApGlobal::unlockCerrMutex();

  data[0]=timeiter;
  data[1]=dsin;

  // create the sensor and register it
  static ApDoubleSensor timesensor("timesensor", propsensore, data[0], ndouble, bufferFactor ) ;
  timesensor.registerStartPoint();
  ApGlobal::lockCerrMutex();
  cerr << "sensore registrato" << endl;
  ApGlobal::unlockCerrMutex();

  sensore = &timesensor;
  return 999;
}

int ap_unregsensors_(){
  cerr << " sono in unregsensor" << endl;
  (*sensore).unRegisterStartPoint();
  ApGlobal::lockCerrMutex();
  cerr << "sensore deregistrato" << endl;
  ApGlobal::unlockCerrMutex();
  return 999;
}

} // extern "C"
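The routines above rely on a common convention for calling C from Fortran 77: the C symbol carries a trailing underscore and every argument is received by reference. This holds for many Unix Fortran compilers but is compiler dependent, so it is an assumption here. The C++ sketch below shows the pattern on a timer similar to secondi_, together with a helper that returns the time elapsed since a value obtained earlier. The names wall_seconds_ and elapsed_since_ are hypothetical and do not appear in the application.

```cpp
#include <sys/time.h>

extern "C" {

// Wall-clock time in seconds since the Unix epoch.
// Under the assumed convention, Fortran would call it as: T = wall_seconds()
double wall_seconds_() {
    struct timeval now;
    gettimeofday(&now, 0);
    return static_cast<double>(now.tv_sec) +
           static_cast<double>(now.tv_usec) * 1.0e-6;
}

// Seconds elapsed since *start, where *start comes from wall_seconds_().
// The argument is a pointer because Fortran passes arguments by reference.
double elapsed_since_(const double* start) {
    return wall_seconds_() - *start;
}

}  // extern "C"
```

On the Fortran side both functions would have to be declared DOUBLE PRECISION, as is done for SECONDI in the listings of Appendix B, since implicit typing would otherwise treat their results as single precision reals.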

Appendix E

The configuration file ass.out.xml, modified for execution on 4 nodes:

<?xml version="1.0" ?>

<!DOCTYPE assist_config SYSTEM "ASSIST.DTD">

<assist_config>

</modules>

<tree>

<instance_mod type_mod="mod" instance_id="ND000 genera" mod_id="genera" />

<instance_mod type_mod="parmod" instance_id="ND001 elabora" mod_id="elabora" />

<instance_mod type_mod="mod" instance_id="ND002 stampa" mod_id="stampa" />

</generic>

</tree>

<lib lib_id="lND000 genera_i686-pc-linux-gnu" type="assist" object_type="EXE"

url="/users/home4/laccettiwp8/tmpast/bin/i686-pc-linux-gnu/ND000 genera"></lib>

<lib lib_id="lND001 elabora_ism_i686-pc-linux-gnu" type="in" object_type="EXE"

url="/users/home4/laccettiwp8/tmpast/bin/i686-pc-linux-gnu/ND001 elabora_ism"></lib>

<lib lib_id="lND001 elabora_osm_i686-pc-linux-gnu" type="out" object_type="EXE"

url="/users/home4/laccettiwp8/tmpast/bin/i686-pc-linux-gnu/ND001 elabora_osm"></lib>

<lib lib_id="lND001 elabora_vpm_i686-pc-linux-gnu" type="replicated" object_type="EXE"

url="/users/home4/laccettiwp8/tmpast/bin/i686-pc-linux-gnu/ND001 elabora_vpm"></lib>

<lib lib_id="lND002 stampa_i686-pc-linux-gnu" type="assist" object_type="EXE"

url="/users/home4/laccettiwp8/tmpast/bin/i686-pc-linux-gnu/ND002 stampa"></lib>

</libraries>

<mod_lib_relation mod_id="genera">

<lib_ref lib_id="lND000 genera_i686-pc-linux-gnu" ></lib_ref>

</mod_lib_relation>

<mod_lib_relation mod_id="stampa">

<lib_ref lib_id="lND002 stampa_i686-pc-linux-gnu" ></lib_ref>

</mod_lib_relation>

<mod_lib_relation mod_id="elabora">

<lib_ref lib_id="lND001 elabora_ism_i686-pc-linux-gnu" ></lib_ref>

<lib_ref lib_id="lND001 elabora_osm_i686-pc-linux-gnu" ></lib_ref>

<lib_ref lib_id="lND001 elabora_vpm_i686-pc-linux-gnu" ></lib_ref>

</mod_lib_relation>

</bindings>

</structure>

</machines>

<map_instance_mod instance_id="ND000 genera" lib_id="lND000 genera_i686-pc-linux-gnu" >

<map_pools pool_ref="pool0" />

</map_instance_mod>

<map_instance_mod instance_id="ND001 elabora" lib_id="lND001 elabora_ism_i686-pc-linux-gnu" >

<map_pools pool_ref="pool0" />

</map_instance_mod>

<map_instance_mod instance_id="ND001 elabora" lib_id="lND001 elabora_osm_i686-pc-linux-gnu" >

<map_pools pool_ref="pool1" />

</map_instance_mod>

<map_instance_mod instance_id="ND001 elabora" lib_id="lND001 elabora_vpm_i686-pc-linux-gnu" >

<map_pools pool_ref="pool0" />

<map_pools pool_ref="pool1" />

<map_pools pool_ref="pool2" />

<map_pools pool_ref="pool3" />

</map_instance_mod>

<map_instance_mod instance_id="ND002 stampa" lib_id="lND002 stampa_i686-pc-linux-gnu" >

<map_pools pool_ref="pool1" />

</map_instance_mod>

</mapping>

</loading>

</assist_config>
