March 31st, 1998
- Data Storage / Management
- Major Topics Discussed
- Why Generate Output?
- Parallel File Systems
- Alphabet Soup of File Systems
- Role of Library Linked to Application
- Role of Library Linked to Application con’t
- Speed vs. Flexible I/O
- I/O Tradeoffs
- I/O Tradeoffs con’t
- Potential Solutions to I/O Problems
- “Nominal C-plant” I/O con’t
- Potential Solutions to I/O Problems con’t
- Potential Solutions to I/O Problems con’t
- Visualization
- Scalable Visualization
- Sandia ASCI RED
Slide 1 – Data Storage / Management & Visualization
Charleston SOS Workshop
Slide 2 – Major Topics Discussed
- Parallel File Systems
- Apps I/O Libraries
- I/O Tradeoffs
- Potential I/O Solutions
Slide 3 – Why Generate Output?
- Internal Uses
- Checkpoint
- Postprocessing
- Interprocess/Partition Communication
- External Uses
- Make data available for other uses/users
- Visualization
- Archival Storage
Slide 4 – Parallel File Systems
- PMF - Parallel Media Files
- One file on many disks
- PNF - Parallel Node Files
- Many nodes access one file
- MNMF - Many Node to Many Files
- Many nodes accessing many files
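As an illustration of "one file on many disks", here is a minimal sketch of round-robin striping. The slides do not describe any particular layout, so the function name `locate`, the stripe size, and the disk count are all assumptions chosen for the example.

```python
def locate(offset, stripe_size, num_disks):
    """Map a logical file byte offset to (disk index, byte offset on that disk)
    under simple round-robin striping (an assumed layout, not one from the slides)."""
    stripe = offset // stripe_size        # which stripe unit holds the byte
    disk = stripe % num_disks             # stripe units rotate across the disks
    local_stripe = stripe // num_disks    # stripe units already placed on that disk
    return disk, local_stripe * stripe_size + offset % stripe_size


if __name__ == "__main__":
    # Hypothetical parameters: 64 KiB stripe units over 4 disks.
    for off in (0, 65536, 300000):
        print(off, locate(off, stripe_size=65536, num_disks=4))
```

With this kind of layout, a large sequential read touches every disk in turn, which is where the aggregate bandwidth of a parallel media file would come from.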
Slide 5 – Alphabet Soup of File Systems
- PFS Intel
- PIOFS/GPFS IBM
- XFS/XLV SGI
- AdFS DEC
- DFS, Galley, Passion, Veritas, “Sun FS”, Unicos, UFS, ...
- Merits and Demerits of DFS, NFS, HDF
- All must be tailored to parallel SOS needs
Slide 6 – Role of Library Linked to Application
- (e.g. Silo, Exodus, HDF, ...)
- Collective I/O
- Tie to App specific view of data
- Buffering
- Async I/O
- Blocking to match hardware needs
- Data Format Conversion
- Must consider performance
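The "Buffering" and "Blocking to match hardware needs" roles can be sketched as a small wrapper that collects many small application writes and forwards them in whole blocks. This is only an illustration: the class name `BlockBufferedWriter`, the `sink` object, and the block size are hypothetical and do not come from Silo, Exodus, or HDF.

```python
class BlockBufferedWriter:
    """Collect small writes and forward them to `sink` in whole blocks.

    `sink` is any object with a write(bytes) method (hypothetical interface).
    """

    def __init__(self, sink, block_size):
        self.sink = sink
        self.block_size = block_size
        self.pending = bytearray()

    def write(self, data):
        self.pending += data
        # Emit only whole blocks; the remainder stays buffered.
        while len(self.pending) >= self.block_size:
            self.sink.write(bytes(self.pending[:self.block_size]))
            del self.pending[:self.block_size]

    def close(self):
        # Flush the final partial block, if any.
        if self.pending:
            self.sink.write(bytes(self.pending))
            self.pending.clear()


if __name__ == "__main__":
    import io

    out = io.BytesIO()
    writer = BlockBufferedWriter(out, block_size=8)
    for _ in range(5):
        writer.write(b"abc")   # five 3-byte writes from the application
    writer.close()
    print(out.getvalue())
```

The application keeps its own natural write granularity, while the storage system only ever sees requests sized to suit the hardware (plus one trailing partial block).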
Slide 7 – Role of Library Linked to Application con’t
- Restructuring of Data
- M to N mapping
- Subsetting
- Compression/Decompression
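"M to N mapping" can be illustrated with a sketch that works out which pieces each of N readers must fetch from data written by M writers. It assumes a 1-D array split into contiguous, nearly equal blocks; that decomposition, and the names `block_range` and `m_to_n_plan`, are assumptions made for the example rather than anything the slides specify.

```python
def block_range(rank, parts, total):
    """Half-open [start, stop) owned by `rank` when `total` items are split
    into `parts` nearly equal contiguous blocks."""
    base, extra = divmod(total, parts)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def m_to_n_plan(m, n, total):
    """For each of n readers, list the (writer, start, stop) pieces to fetch."""
    plan = []
    for reader in range(n):
        r_start, r_stop = block_range(reader, n, total)
        pieces = []
        for writer in range(m):
            w_start, w_stop = block_range(writer, m, total)
            lo, hi = max(r_start, w_start), min(r_stop, w_stop)
            if lo < hi:  # the two blocks overlap
                pieces.append((writer, lo, hi))
        plan.append(pieces)
    return plan


if __name__ == "__main__":
    # Data written by 4 nodes, read back by 2 nodes, 10 items in total.
    for reader, pieces in enumerate(m_to_n_plan(4, 2, 10)):
        print(reader, pieces)
```

The same overlap computation also covers subsetting: a reader that wants only part of the data simply uses a narrower range.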
Slide 8 – Speed vs. Flexible I/O
- Intermediate machine to convert
- Use most capable machine to restructure data
- Different use Model (Experiment)
- Careful Planning of Runs
- Use of funds
- $Cnodes + $Cnet + $IOsys
- What is the balance on current MPPs?
Slide 9 – I/O Tradeoffs
- Data volumes – unload in real-time
- Closeness to real-time viz
- Can you predict users' data requests?
- How to structure output
- How many user requests must be satisfied?
Slide 10 – I/O Tradeoffs con’t
- How many users?
- Where does the data reside?
- How many copies (and where, physically)?
- Funds
- How many retrievals?
- How much effort should be directed to structuring data?
- Different structures for different applications
Slide 11 – Potential Solutions to I/O Problems
- “Nominal C-plant” I/O
- 10 compute nodes per I/O node
- Parallel Media File System
- Block Server
- Directory Server
- Optimized for internal use, but also can be accessed externally
- Control & Data Separate
- Desire no local disk on compute nodes
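The 10:1 ratio of compute nodes to I/O nodes can be made concrete with a small sketch. The slide gives only the ratio; the static, contiguous assignment used here and the function names `io_node_for` and `io_nodes_needed` are assumptions for illustration.

```python
def io_node_for(compute_node, ratio=10):
    """I/O node serving a given compute node, assuming contiguous groups
    of `ratio` compute nodes share one I/O node (assumed policy)."""
    return compute_node // ratio


def io_nodes_needed(num_compute, ratio=10):
    """Number of I/O nodes required so no I/O node serves more than `ratio`."""
    return -(-num_compute // ratio)  # ceiling division


if __name__ == "__main__":
    print(io_node_for(9), io_node_for(10))  # last node of group 0, first of group 1
    print(io_nodes_needed(95))
```

Because compute nodes have no local disk in this design, every checkpoint or output byte flows through the assigned I/O node, so the ratio directly bounds per-node I/O bandwidth.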
Slide 12 – “Nominal C-plant” I/O con’t
- “Read & Broadcast” capability
- Checkpoint mode
- Users have private data or environment to be attached
- Work still in progress
Slide 13 – Potential Solutions to I/O Problems con’t
- Third I/O Interconnect
- control – boot, diagnostics
- “backplane” – data net for computation
- I/O network
- Data path between Compute and I/O nodes
- I/O coordination & metadata
- To effect “Double Headedness” (i.e. non-compute-node access to storage media)
- Increase compute node determinism
Slide 14 – Potential Solutions to I/O Problems con’t
- Integrated Archival Storage System
- Specialized I/O nodes
- DIGITAL Server 8000 per 4-10 scalable units
- I/O nodes integrated at the scalable unit level
Slide 15 – Visualization
- ASCI Machine(s) (appears twice in the slide diagram)
- Data Server
- buffer
- seamless transmission
Slide 16 – Scalable Visualization
Slide 17 – Sandia ASCI RED