Merge pull request #5 from JintaoLee-Roger/dev
v1.1.7
JintaoLee-Roger authored Jul 30, 2024
2 parents db0b2c9 + 9f667b2 commit 78adb30
Showing 25 changed files with 1,425 additions and 663 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -24,6 +24,7 @@ _doxygenxml
docs/cigsegy
segyfile
__pycache__
todo.txt

test.*
thridpart/
2 changes: 1 addition & 1 deletion CMakeLists.txt
@@ -24,7 +24,7 @@ ENDIF(CMAKE_INSTALL_PREFIX_INITIALIZED_TO_DEFAULT)
message("Install Prefix: " ${CMAKE_INSTALL_PREFIX})

set(BUILD_PYTHON ON)
set(BUILD_TOOLS ON)
set(BUILD_TOOLS OFF)
set(ENABLE_OPENMP OFF)
set(BUILD_TEST OFF)

52 changes: 44 additions & 8 deletions README.rst
@@ -83,17 +83,11 @@ intervals, data format ...
.. Note::

If you are unsure about the values of some parameters,
you can use the default values, such as
you can ignore them and cigsegy will try to guess them automatically.

.. code-block:: python

>>> cigsegy.metaInfo('fx.segy', iline=9, xline=21)

You also can set ``use_guess=True`` to use the guessed parameters:

.. code-block:: python

>>> cigsegy.metaInfo('rogan.sgy', use_guess=True)
>>> cigsegy.metaInfo('fx.segy', iline=9, xline=21) # ignore istep, xstep, ...


4. Read the SEG-Y
@@ -139,6 +133,25 @@ There is often such a workflow:
# d is a numpy array, d.shape == (n-inlines, n-crosslines, n-time)
>>> cigsegy.create('out.segy', d, format=5, start_time=0, iline_interval=15, ...)


7. Access the SEG-Y file as a 3D numpy array, without reading the whole file into memory

.. code-block:: python

>>> from cigsegy import SegyNP
>>> d = SegyNP('rogan.sgy', iline=9, xline=21)
>>> d.shape # (ni, nx, nt), use as a numpy array, 3D geometry
>>> sx = d[100] # the 100-th inline profile
>>> sx = d[100:200] # return a 3D array with shape (100, nx, nt)
>>> sx = d[:, 200, :] # the 200-th crossline profile
>>> sx = d[:, :, 100] # the 100-th time slice; this may be slow if the file is large
>>> d.min(), d.max()
# get the min and max values; they are evaluated from only part of the data,
# so they may not be the true min and max
>>> d.trace_cout # get the number of traces in the file
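
The caveat about ``min``/``max`` being evaluated from part of the data can be illustrated without cigsegy. The sketch below is only an illustration of the general idea, not how ``SegyNP`` computes its values: ``estimate_range`` and its ``step`` parameter are hypothetical names, and traces are plain Python lists. It scans every ``step``-th trace, so extremes that sit in skipped traces are missed.

```python
def estimate_range(traces, step=4):
    """Estimate (min, max) by scanning only every `step`-th trace.

    Hypothetical helper for illustration; not part of cigsegy.
    """
    sampled = traces[::step]
    lo = min(min(t) for t in sampled)
    hi = max(max(t) for t in sampled)
    return lo, hi


# Six synthetic traces; the true extremes (-9.0 and 12.0) sit in traces 1 and 5.
traces = [
    [0.0, 1.0, 2.0],
    [-9.0, 0.5, 1.5],
    [0.2, 0.4, 3.0],
    [1.0, 1.0, 1.0],
    [-2.0, 0.0, 2.5],
    [0.0, 12.0, 0.1],
]

approx = estimate_range(traces, step=2)  # scans traces 0, 2 and 4 only
exact = estimate_range(traces, step=1)   # scans every trace
```

Here ``approx`` misses both true extremes because they lie in traces that were never read, which is the trade-off the note describes: a cheap estimate in exchange for possibly inexact bounds.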



License
=======

@@ -151,6 +164,29 @@ TODO
- Add convenient function to support **unsorted** prestack gathers.


Citations
===========
If you find this work useful in your research and want to cite it, please consider using the following:

Plain Text

.. code-block:: python

Li, Jintao. "CIGSEGY: A tool for exchanging data between SEG-Y format and NumPy array inside Python environment". URL: https://github.com/JintaoLee-Roger/cigsegy


BibTeX

.. code-block:: latex

@misc{cigsegy,
author = {Li, Jintao},
title = {CIGSEGY: A tool for exchanging data between SEG-Y format and NumPy array inside Python environment},
url = {\url{https://github.com/JintaoLee-Roger/cigsegy}},
}



=========

.. [1] Here **irregular** SEG-Y volume means the area covered by a SEG-Y file is not a rectangle but a polygon (meaning that some lines are missing some traces), or its inline/crossline intervals are not 1.
42 changes: 2 additions & 40 deletions README_ZH.md
@@ -7,6 +7,8 @@
</tr>
</table>

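## The Chinese README is not up to date; it only covers up to 1.1.5. Please refer to the English version for details.

**This file was last updated for version 1.1.5**

A `python` and `c++` tool for reading and writing seismic data in `segy` format. It can read a `segy` file into memory or convert it directly to a binary file, and it can also save a `numpy` array as a `segy` file.
@@ -46,7 +48,6 @@ cigsegy.create_by_sharing_header('out.segy', 'header.segy', d, iline=, xline=, i
- [Third-party dependencies](#ThirdPart)
- [Two executables that run in the terminal](#Executables)
- [Limitations](#Limitations)
- [Comparison](#Comparison)
- [Acknowledgements](#Acknowledge)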

<p id="Installation"></p>
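@@ -360,45 +361,6 @@ segy.create('new_fx.segy', d)
### Limitations

- Only post-stack data has been tested
- Only 4-byte IBM floats and 4-byte IEEE floats are supported


<p id="Comparison"></p>

### Comparison

Compared with `segysak`, our implementation is faster (`segysak` is implemented in pure `Python`).

When reading segy files, `cigsegy` is slightly slower than `segyio`, but the gap is small. However, `cigsegy` is faster than `segyio` when creating `segy` files.

`segyio` assumes the file is a sorted 3D dataset. It also supports files that are just a collection of traces (non-strict mode), but in that mode many features are disabled and raise errors. However, many segy files are sorted but are missing some traces. Although such files are easy to handle,
`segyio` does not support them. `cigsegy` supports them through the same methods, e.g. `cigsegy.fromfile('miss.segy')`. In addition, `cigsegy` can handle files whose inline and crossline intervals are not 1.


For various reasons (confidentiality requirements, recording errors, etc.), the file headers may be corrupted. If you remember the volume size and the sample format (1 for IBM, 5 for IEEE),
`cigsegy` can also read such files. Just use
```python
d = cigsegy.fromfile_ignore_header('miss.segy', inline_size, crossline_size, time, dformat)
```

`segymat` is implemented in MATLAB, and it runs very slowly.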

`shape = (651, 951, 462) 1.3G`
|mode|cigsegy|segyio|segysak|segymat|
|---|---|---|---|---|
|read|1.212s|0.944s|134.8s|151.99s|
|create|3.01s|14.03s|-|-|



For `cigsegy` and `segyio`, `read` was run 3 times;
`segysak` was run only once.
`(1062, 2005, 2401) 20G`
|mode|cigsegy|segyio|segysak|segymat|
|---|---|---|---|---|
|read|45.6s+45.5s+45s|65s+20s+19s|612.45s|>1500s|
|create|41.14s+43.35s|78.74s+82.74s|-|-|


<p id="Acknowledge"></p>

29 changes: 18 additions & 11 deletions build.sh
@@ -12,7 +12,7 @@ if [[ "$OSTYPE" == "linux-gnu"* ]]; then
# fi
# done

cp -r /share/thridpart/fmt_2_24/* /usr/local/
cp -r /share/thridpart/fmt_8/* /usr/local/
rm -rf /share/build
mkdir /share/build
rm -rf /share/wheels
@@ -23,7 +23,7 @@ if [[ "$OSTYPE" == "linux-gnu"* ]]; then
do
version="${cpyroot#/opt/_internal/cpython-}"
version_min="${version%.*}"
if [ "$version_min" != "3.5" ] ; then
if [ "$version_min" != "3.13" ] ; then
cpy=$cpyroot/bin/python3
cpylib=$cpyroot/lib
pypi=$cpyroot/bin/pip3
@@ -36,7 +36,7 @@ if [[ "$OSTYPE" == "linux-gnu"* ]]; then
$pypi wheel .
whl=`ls *.whl`
prefix="${whl%linux_x86_64.whl}"
cp $whl /share/wheels/${prefix}manylinux_2_24_x86_64.whl
cp $whl /share/wheels/${prefix}manylinux2014_x86_64.whl
# auditwheel repair $whl -w /share/wheelhouse/
cd /share/build
rm CMakeCache.txt
@@ -55,19 +55,26 @@ if [[ "$OSTYPE" == "linux-gnu"* ]]; then
# done


rm -rf /share/thridpart/cigsegy/cigsegy*
rm -rf /share/thridpart/cigsegy/dist
rm -rf /share/thridpart/cigsegy/src
rm -rf /share/thridpart/cigsegy/python

rm -rf /share/thridpart/cigsegy
mkdir /share/thridpart/cigsegy

cp -r /share/src /share/thridpart/cigsegy/
cp -r /share/python /share/thridpart/cigsegy/
cp -r /share/setup.py /share/thridpart/cigsegy/

cp -r /share/README.rst /share/thridpart/cigsegy/
cp -r /share/thridpart/fmt_8/include /share/thridpart/cigsegy/
cp /share/LICENSE /share/thridpart/cigsegy/

echo "include LICENSE
recursive-include python *.pyi *.py *.cpp
recursive-include src *.h *.hpp *.cpp
recursive-include include *.h *.hpp" > /share/thridpart/cigsegy/MANIFEST.in

sed -i "s/fmt_root = ''/fmt_root = '.'/" /share/thridpart/cigsegy/setup.py

sed -i "s/'numpy'/'numpy', 'pybind11'/g" /share/thridpart/cigsegy/setup.py

cd /share/thridpart/cigsegy/
python=/opt/_internal/cpython-3.9.15/bin/python3
python=/opt/_internal/cpython-3.11.9/bin/python3
${python} setup.py sdist
mv /share/thridpart/cigsegy/dist/*.tar.gz /share/wheels/

13 changes: 13 additions & 0 deletions docs/changelog.rst
@@ -2,6 +2,19 @@
Changelog
#########

v1.1.7
--------

- Refine the ``scan`` function to support more situations.
- Add support for more data sample formats, such as 4-byte two's complement integers.
- Add a new class ``SegyNP`` that lets a SEG-Y file be accessed like a numpy array.
- Add functions: ``scan_unsorted3D`` and ``load_unsorted3D`` to support 3D unsorted data.
- Remove the comparison section from the documentation, as ``segysak`` has had a major update.
- ``use_guess`` in functions like ``metaInfo`` has been deprecated.
- Add more atomic operations, enabling finer control of SEG-Y files.
- And more ...
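
As a rough illustration of the 4-byte two's complement integer sample format mentioned above (data sample format code 2 in the SEG-Y standard), the sketch below decodes such samples with Python's ``struct`` module. It assumes big-endian byte order, which is the usual case for SEG-Y files, and ``decode_int32_samples`` is a hypothetical helper name, not a cigsegy function.

```python
import struct


def decode_int32_samples(raw: bytes) -> list:
    """Decode big-endian 4-byte two's complement integers (SEG-Y format code 2).

    Hypothetical helper for illustration; not part of cigsegy.
    """
    if len(raw) % 4 != 0:
        raise ValueError("sample buffer length must be a multiple of 4 bytes")
    count = len(raw) // 4
    # ">" selects big-endian, "i" is a signed 32-bit integer.
    return list(struct.unpack(f">{count}i", raw))


# Four samples: 1, -1, the smallest and the largest 32-bit signed integers.
raw = bytes.fromhex("00000001" "ffffffff" "80000000" "7fffffff")
samples = decode_int32_samples(raw)
```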


v1.1.6
-------

58 changes: 0 additions & 58 deletions docs/comparison.rst

This file was deleted.

45 changes: 45 additions & 0 deletions docs/core/SegyNP.rst
@@ -0,0 +1,45 @@
SegyNP class
###################

This class lets you treat SEG-Y files as 3D/2D numpy arrays without reading the whole file into memory.
Data is read from disk only when you access it, for example when you slice out part of the volume.


3D array
==========

If the SEG-Y file is a 3D post-stack seismic file:

.. code-block:: python

>>> from cigsegy import SegyNP
>>> d = SegyNP('3Dpoststack.sgy', iline=189, xline=193)
>>> d.shape # (ni, nx, nt), use as a numpy array, 3D geometry
>>> sx = d[100] # the 100-th inline profile
>>> sx = d[100:200] # return a 3D array with shape (100, nx, nt)
>>> sx = d[:, 200, :] # the 200-th crossline profile
>>> sx = d[:, :, 100] # the 100-th time slice; this may be slow if the file is large
>>> d.min(), d.max()
# get the min and max values; they are evaluated from only part of the data,
# so they may not be the true min and max
>>> d.trace_cout # get the number of traces in the file
>>> d.close() # close the file
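
Why an inline profile is cheap to read while a time slice may be slow can be seen from how indices map to traces. The sketch below assumes a regular, rectangular volume with no missing traces, stored inline by inline, with 4-byte samples and no extended textual headers; the function names are hypothetical and do not come from cigsegy. The 3600-byte file header (3200-byte textual plus 400-byte binary) and the 240-byte trace header are the standard SEG-Y sizes.

```python
def inline_traces(i, nx):
    """Trace numbers that make up inline `i` (contiguous in the file)."""
    return range(i * nx, (i + 1) * nx)


def crossline_traces(x, ni, nx):
    """Trace numbers that make up crossline `x` (one trace per inline)."""
    return range(x, ni * nx, nx)


def trace_offset(n, nt, sample_size=4):
    """Byte offset of trace `n`, counting the file header and earlier traces."""
    return 3600 + n * (240 + nt * sample_size)


ni, nx, nt = 3, 4, 10
inline_1 = list(inline_traces(1, nx))
crossline_2 = list(crossline_traces(2, ni, nx))
offset_5 = trace_offset(5, nt)
```

Under these assumptions an inline is one contiguous run of traces and a crossline is a strided set, whereas a time slice needs one sample from every trace in the file, so it touches the whole data range on disk.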


2D array
==========

If you don't need a 3D geometry and just want to treat the SEG-Y file as a collection of
1D traces, i.e., a 2D array, use the following code:

.. code-block:: python

>>> from cigsegy import SegyNP
>>> d = SegyNP('2Dseismic.sgy', as_2d=True)
>>> d.shape # (trace_count, nt), use as a numpy array, collection of 1D traces
>>> sx = d[100] # the 100-th trace
>>> sx = d[100:200] # return a 2D array with shape (100, nt)
>>> d.min(), d.max()
# get the min and max values; they are evaluated from only part of the data,
# so they may not be the true min and max
>>> d.trace_cout # the number of traces in the file
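
The general idea of reading traces on demand can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the ``SegyNP`` implementation: ``LazyTraces`` is a hypothetical class, the samples are assumed to be big-endian 4-byte IEEE floats, the number of samples per trace ``nt`` is passed in rather than parsed from the binary header, and the synthetic file has zeroed headers, so it is not a valid SEG-Y file.

```python
import os
import struct
import tempfile

FILE_HEADER = 3600   # 3200-byte textual header + 400-byte binary header
TRACE_HEADER = 240   # standard SEG-Y trace header size


class LazyTraces:
    """Read single traces on demand (hypothetical sketch, not cigsegy's SegyNP)."""

    def __init__(self, path, nt):
        self.path = path
        self.nt = nt
        self.trace_bytes = TRACE_HEADER + 4 * nt
        data_bytes = os.path.getsize(path) - FILE_HEADER
        self.trace_count = data_bytes // self.trace_bytes

    @property
    def shape(self):
        return (self.trace_count, self.nt)

    def __getitem__(self, n):
        if not 0 <= n < self.trace_count:
            raise IndexError(n)
        with open(self.path, "rb") as f:
            # Skip the file header, earlier traces, and this trace's header.
            f.seek(FILE_HEADER + n * self.trace_bytes + TRACE_HEADER)
            raw = f.read(4 * self.nt)
        return list(struct.unpack(f">{self.nt}f", raw))


# Build a tiny synthetic file: 3 traces of 4 big-endian IEEE float samples.
nt = 4
path = os.path.join(tempfile.mkdtemp(), "tiny.sgy")
with open(path, "wb") as f:
    f.write(bytes(FILE_HEADER))
    for n in range(3):
        f.write(bytes(TRACE_HEADER))
        f.write(struct.pack(f">{nt}f", *[n + 0.5 * k for k in range(nt)]))

d = LazyTraces(path, nt)
second = d[1]  # only this trace's samples are read from disk
```

Only ``d[1]`` triggers a read, and it touches just ``4 * nt`` bytes; the rest of the file stays on disk.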