Sindbad~EG File Manager
�
Mٜgm � � � d dl mZ d dlmZmZ d dlZd dlmZ d dlm Z d dl
mZ d dlm
Z d dlmZ e rd d lmZmZmZmZ d d
lmZmZ d dlmZmZmZ ed� Z ed
� Z ed� Z ed� Z ddeeee dd�Z! ed� Z" ed� Z#dddee"e#dd�Z$ ed� Z%d9d�Z&d:d�Z' d; d<d�Z( G d� de� Z) G d� de)� Z* G d � d!e)� Z+ G d"� d#� Z, G d$� d%e,� Z- G d&� d'e,� Z. G d(� d)e� Z/ G d*� d+e/� Z0 G d,� d-e0� Z1 G d.� d/e/� Z2 G d0� d1e0e2� Z3 G d2� d3e/� Z4 G d4� d5e4� Z5 G d6� d7e4e2� Z6d=d8�Z7y)>� )�annotations)�ABC�abstractmethodN)�dedent)�
TYPE_CHECKING��
get_option)�format)�pprint_thing)�Iterable�Iterator�Mapping�Sequence)�Dtype�WriteBuffer)� DataFrame�Index�Seriesa max_cols : int, optional
When to switch from the verbose to the truncated output. If the
DataFrame has more than `max_cols` columns, the truncated output
is used. By default, the setting in
``pandas.options.display.max_info_columns`` is used.aR show_counts : bool, optional
Whether to show the non-null counts. By default, this is shown
only if the DataFrame is smaller than
``pandas.options.display.max_info_rows`` and
``pandas.options.display.max_info_columns``. A value of True always
shows the counts, and False never shows the counts.a� >>> int_values = [1, 2, 3, 4, 5]
>>> text_values = ['alpha', 'beta', 'gamma', 'delta', 'epsilon']
>>> float_values = [0.0, 0.25, 0.5, 0.75, 1.0]
>>> df = pd.DataFrame({"int_col": int_values, "text_col": text_values,
... "float_col": float_values})
>>> df
int_col text_col float_col
0 1 alpha 0.00
1 2 beta 0.25
2 3 gamma 0.50
3 4 delta 0.75
4 5 epsilon 1.00
Prints information of all columns:
>>> df.info(verbose=True)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 3 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 int_col 5 non-null int64
1 text_col 5 non-null object
2 float_col 5 non-null float64
dtypes: float64(1), int64(1), object(1)
memory usage: 248.0+ bytes
Prints a summary of columns count and its dtypes but not per column
information:
>>> df.info(verbose=False)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Columns: 3 entries, int_col to float_col
dtypes: float64(1), int64(1), object(1)
memory usage: 248.0+ bytes
Pipe output of DataFrame.info to buffer instead of sys.stdout, get
buffer content and writes to a text file:
>>> import io
>>> buffer = io.StringIO()
>>> df.info(buf=buffer)
>>> s = buffer.getvalue()
>>> with open("df_info.txt", "w",
... encoding="utf-8") as f: # doctest: +SKIP
... f.write(s)
260
The `memory_usage` parameter allows deep introspection mode, specially
useful for big DataFrames and fine-tune memory optimization:
>>> random_strings_array = np.random.choice(['a', 'b', 'c'], 10 ** 6)
>>> df = pd.DataFrame({
... 'column_1': np.random.choice(['a', 'b', 'c'], 10 ** 6),
... 'column_2': np.random.choice(['a', 'b', 'c'], 10 ** 6),
... 'column_3': np.random.choice(['a', 'b', 'c'], 10 ** 6)
... })
>>> df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000000 entries, 0 to 999999
Data columns (total 3 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 column_1 1000000 non-null object
1 column_2 1000000 non-null object
2 column_3 1000000 non-null object
dtypes: object(3)
memory usage: 22.9+ MB
>>> df.info(memory_usage='deep')
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000000 entries, 0 to 999999
Data columns (total 3 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 column_1 1000000 non-null object
1 column_2 1000000 non-null object
2 column_3 1000000 non-null object
dtypes: object(3)
memory usage: 165.9 MBz� DataFrame.describe: Generate descriptive statistics of DataFrame
columns.
DataFrame.memory_usage: Memory usage of DataFrame columns.r z and columns� )�klass�type_sub�max_cols_sub�show_counts_sub�examples_sub�see_also_sub�version_added_suba� >>> int_values = [1, 2, 3, 4, 5]
>>> text_values = ['alpha', 'beta', 'gamma', 'delta', 'epsilon']
>>> s = pd.Series(text_values, index=int_values)
>>> s.info()
<class 'pandas.core.series.Series'>
Index: 5 entries, 1 to 5
Series name: None
Non-Null Count Dtype
-------------- -----
5 non-null object
dtypes: object(1)
memory usage: 80.0+ bytes
Prints a summary excluding information about its values:
>>> s.info(verbose=False)
<class 'pandas.core.series.Series'>
Index: 5 entries, 1 to 5
dtypes: object(1)
memory usage: 80.0+ bytes
Pipe output of Series.info to buffer instead of sys.stdout, get
buffer content and writes to a text file:
>>> import io
>>> buffer = io.StringIO()
>>> s.info(buf=buffer)
>>> s = buffer.getvalue()
>>> with open("df_info.txt", "w",
... encoding="utf-8") as f: # doctest: +SKIP
... f.write(s)
260
The `memory_usage` parameter allows deep introspection mode, specially
useful for big Series and fine-tune memory optimization:
>>> random_strings_array = np.random.choice(['a', 'b', 'c'], 10 ** 6)
>>> s = pd.Series(np.random.choice(['a', 'b', 'c'], 10 ** 6))
>>> s.info()
<class 'pandas.core.series.Series'>
RangeIndex: 1000000 entries, 0 to 999999
Series name: None
Non-Null Count Dtype
-------------- -----
1000000 non-null object
dtypes: object(1)
memory usage: 7.6+ MB
>>> s.info(memory_usage='deep')
<class 'pandas.core.series.Series'>
RangeIndex: 1000000 entries, 0 to 999999
Series name: None
Non-Null Count Dtype
-------------- -----
1000000 non-null object
dtypes: object(1)
memory usage: 55.3 MBzp Series.describe: Generate descriptive statistics of Series.
Series.memory_usage: Memory usage of Series.r z
.. versionadded:: 1.4.0
a�
Print a concise summary of a {klass}.
This method prints information about a {klass} including
the index dtype{type_sub}, non-null values and memory usage.
{version_added_sub}
Parameters
----------
verbose : bool, optional
Whether to print the full summary. By default, the setting in
``pandas.options.display.max_info_columns`` is followed.
buf : writable buffer, defaults to sys.stdout
Where to send the output. By default, the output is printed to
sys.stdout. Pass a writable buffer if you need to further process
the output.
{max_cols_sub}
memory_usage : bool, str, optional
Specifies whether total memory usage of the {klass}
elements (including the index) should be displayed. By default,
this follows the ``pandas.options.display.memory_usage`` setting.
True always show memory usage. False never shows memory usage.
A value of 'deep' is equivalent to "True with deep introspection".
Memory usage is shown in human-readable units (base-2
representation). Without deep introspection a memory estimation is
made based in column dtype and number of rows assuming values
consume the same memory amount for corresponding dtypes. With deep
memory introspection, a real memory usage calculation is performed
at the cost of computational resources. See the
:ref:`Frequently Asked Questions <df-memory-usage>` for more
details.
{show_counts_sub}
Returns
-------
None
This method prints a summary of a {klass} and returns None.
See Also
--------
{see_also_sub}
Examples
--------
{examples_sub}
c �<