classix.CLASSIX.explain

CLASSIX.explain(index1=None, index2=None, cmap='jet', showalldata=False, showallgroups=False, showsplist=False, max_colwidth=None, replace_name=None, plot=False, figsize=(10, 7), figstyle='default', savefig=False, bcolor='#f5f9f9', obj_color='k', width=1.5, obj_msize=160, sp1_color='lime', sp2_color='cyan', sp_fcolor='tomato', sp_marker='+', sp_size=72, sp_mcolor='k', sp_alpha=0.05, sp_pad=0.5, sp_fontsize=10, sp_bbox=None, sp_cmarker='+', sp_csize=110, sp_ccolor='crimson', sp_clinewidths=2.7, dp_fcolor='white', dp_alpha=0.5, dp_pad=2, dp_fontsize=10, dp_bbox=None, show_all_grp_circle=False, show_connected_grp_circle=False, show_obj_grp_circle=True, color='red', connect_color='green', alpha=0.3, cline_width=2, add_arrow=True, arrow_linestyle='--', arrow_fc='darkslategrey', arrow_ec='k', arrow_linewidth=1, arrow_shrinkA=2, arrow_shrinkB=2, directed_arrow=0, axis='off', include_dist=False, show_connected_label=True, figname=None, fmt='pdf')

self.explain(object/index) # prints an explanation of why the data point is in its cluster (or is an outlier)

self.explain(object1/index1, object2/index2) # prints an explanation of why object1 and object2 are in the same cluster or in distinct clusters

Here we unify the terminology:

[-] data points
[-] groups (made up of data points, formed by aggregation)
[-] clusters (made up of groups)
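The three levels nest inside each other: a data point belongs to one group, and a group belongs to one cluster. As a minimal sketch in plain Python (not CLASSIX's internal data structures; the names `point_to_group` and `group_to_cluster` and the toy assignments are hypothetical), the lookup behind an explanation can be pictured as two mappings:

```python
# Hypothetical toy assignment: data point index -> group label.
point_to_group = {1: 9, 1300: 8, 42: 2}

# Hypothetical toy assignment: group label -> cluster label.
group_to_cluster = {9: 0, 2: 0, 8: 0}


def cluster_of(point):
    """Resolve a data point to its cluster by going through its group."""
    return group_to_cluster[point_to_group[point]]


def same_cluster(p, q):
    """Two data points share a cluster if their groups map to the same cluster."""
    return cluster_of(p) == cluster_of(q)


print(cluster_of(1), same_cluster(1, 1300))
```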

Parameters

index1 : int or numpy.ndarray, optional

Object 1, or its index 'index1', to be explained.

index2 : int or numpy.ndarray, optional

Object 2, or its index 'index2', to be explained. When given together with index1, the two objects are compared.

cmap : str, default='jet'

Colormap for the scatter plot.

showalldata : boolean, default=False

Whether to show all data points in the global view when there are too many data points to plot.

showallgroups : boolean, default=False

Whether or not to show the start points marker.

showsplist : boolean, default=False

Whether to show the group center information, which includes the number of data points (NumPts), the corresponding clusters, and the associated coordinates. This only applies when both index1 and index2 are None.

max_colwidth : int, optional

Max width to truncate each column in characters. By default, no limit.

replace_name : str or list, optional

Replace the index with a name in the printed explanation.

* For example, for indices 1 and 1300 we have

classix.explain(1, 1300, plot=False, figstyle="seaborn") # or classix.explain(obj1, obj4)

The data point 1 is in group 9 and the data point 1300 is in group 8, both of which were merged into cluster #0. The two groups are connected via groups 9 -> 2 -> 8.

* If we specify replace_name, the output will be

classix.explain(1, 1300, replace_name=["Peter Meyer", "Anna Fields"], figstyle="seaborn")

The data point Peter Meyer is in group 9 and the data point Anna Fields is in group 8, both of which were merged into cluster #0. The two groups are connected via groups 9 -> 2 -> 8.
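To make the substitution concrete, here is a minimal sketch of how names could stand in for indices in the message. It is not CLASSIX's implementation: `describe_pair` is a hypothetical helper, and treating a single string as the name of the first object only is an assumption.

```python
def describe_pair(index1, index2, group1, group2, cluster, path, replace_name=None):
    """Hypothetical helper: build the two-object message, substituting names if given."""
    names = [str(index1), str(index2)]
    if replace_name is not None:
        # Assumption: a single string names the first object; a list names both.
        given = [replace_name] if isinstance(replace_name, str) else list(replace_name)
        for i, name in enumerate(given[:2]):
            names[i] = name
    route = " -> ".join(str(g) for g in path)
    return (
        f"The data point {names[0]} is in group {group1} and the data point "
        f"{names[1]} is in group {group2}, both of which were merged into "
        f"cluster #{cluster}. The two groups are connected via groups {route}."
    )


msg = describe_pair(1, 1300, 9, 8, 0, [9, 2, 8],
                    replace_name=["Peter Meyer", "Anna Fields"])
print(msg)
```

With the group labels, cluster label, and path taken from the example output, the helper reproduces the second message word for word.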

plot : boolean, default=False

Whether to visualize the explanation.

figsize : tuple, default=(10, 7)

Size of the explanation figure.

figstyle : str, default="default"

Matplotlib style used for the visualization. See https://matplotlib.org/stable/gallery/style_sheets/style_sheets_reference.html for the available styles.

savefig : boolean, default=False

Whether to save the figure. If True, the figure is saved in a folder named "img".

bcolor : str, default="#f5f9f9"

Color for figure background.

obj_color : str, default="k"

Color for the text of the data points given by index1 and index2.

obj_msize : float, default=160

Marker size for the data points given by index1 and index2.

sp_fcolor : str, default='tomato'

Face color of the text box for group centers.

sp_marker : str, default="+"

The marker for the start points.

sp_size : int, default=72

The marker size for the start points.

sp_mcolor : str, default='k'

Color of the scatter markers for the start points.

sp_alpha : float, default=0.05

Transparency of the text box for group centers.

sp_pad : float, default=0.5

Padding of the text box for group centers.

sp_bbox : dict, optional

Dict with properties for patches.FancyBboxPatch for group centers.
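As a hedged illustration of what such a dict can look like, the sketch below builds one from the individual style arguments when no dict is passed. `resolve_bbox` is a hypothetical helper, and the fallback rule is an assumption rather than documented CLASSIX behaviour. The keys `facecolor`, `alpha`, `pad`, and `boxstyle` are ones Matplotlib accepts for a text `bbox`, and the default values are copied from the signature.

```python
def resolve_bbox(bbox=None, fcolor="tomato", alpha=0.05, pad=0.5):
    """Hypothetical helper: use the given dict, else assemble one from the style arguments."""
    if bbox is not None:
        # Copy so that the caller's dict is never modified.
        return dict(bbox)
    return {"facecolor": fcolor, "alpha": alpha, "pad": pad}


print(resolve_bbox())
print(resolve_bbox({"boxstyle": "round", "facecolor": "white"}))
```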

sp_fontsize : int, default=10

Font size of the text labels for group centers.

sp_cmarker : str, default="+"

The marker for the connected group centers.

sp_csize : int, default=110

The marker size for the connected group centers.

sp_ccolor : str, default="crimson"

The marker color for the connected group centers.

sp_clinewidths : float, default=2.7

The marker line width for the connected group centers.

dp_fcolor : str, default='white'

Face color of the text box for the specified data objects.

dp_alpha : float, default=0.5

Transparency of the text box for the specified data objects.

dp_pad : int, default=2

Padding of the text box for the specified data objects.

dp_fontsize : int, default=10

Font size of the text labels for the specified data objects.

dp_bbox : dict, optional

Dict with properties for patches.FancyBboxPatch for specified data objects.

show_all_grp_circle : bool, default=False

Whether to show the periphery of all groups within the objects’ clusters (only applies when the data dimension is at most 2).

show_connected_grp_circle : bool, default=False

Whether to show the periphery of all connected groups within the objects’ clusters (only applies when the data dimension is at most 2).

show_obj_grp_circle : bool, default=True

Whether to show the periphery of the groups that contain the objects (only applies when the data dimension is at most 2).

color : str, default='red'

Color for text of group centers labels in visualization.

alpha : float, default=0.3

Transparency of data points. Scalar or None.

cline_width : float, default=2

Line width of the circles drawn around group centers.

add_arrow : bool, default=True

Whether to add arrows along the connected paths.

arrow_linestyle : str, default='--'

Linestyle for arrow.

arrow_fc : str, default='darkslategrey'

Face color for arrow.

arrow_ec : str, default='k'

Edge color for arrow.

arrow_linewidth : float, default=1

Set the linewidth of the arrow edges.

directed_arrow : int, default=0

Whether the arrow edges are directed. Takes values in {-1, 0, 1}: 0 means undirected, and -1 means the direction opposite to 1.

arrow_shrinkA, arrow_shrinkB : float, default=2

Shrinking factor of the tail and head of the arrow respectively.

axis : str, default='off'

Whether to add x and y axes to the plot.

include_dist : boolean, default=False

Whether to include distance information when computing the shortest path between objects.
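The difference between ignoring and using distances can be sketched on a toy graph of groups. This is an illustration only: that the two settings correspond to a hop-count search and a distance-weighted search is an assumption, and the graph, its weights, and the function names are hypothetical.

```python
import heapq
from collections import deque


def _walk_back(prev, goal):
    """Rebuild the path from the predecessor map, or return None if unreachable."""
    if goal not in prev:
        return None
    path, node = [], goal
    while node is not None:
        path.append(node)
        node = prev[node]
    return path[::-1]


def fewest_hops(adj, start, goal):
    """Breadth-first search: the path through the fewest groups, ignoring distances."""
    prev = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            break
        for nxt in adj[node]:
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    return _walk_back(prev, goal)


def shortest_distance(adj, start, goal):
    """Dijkstra's algorithm: the path with the smallest summed edge length."""
    dist = {start: 0.0}
    prev = {start: None}
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == goal:
            break
        if d > dist[node]:
            continue  # stale heap entry
        for nxt, w in adj[node].items():
            nd = d + w
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = node
                heapq.heappush(heap, (nd, nxt))
    return _walk_back(prev, goal)


# Hypothetical toy graph: keys are group labels, values map neighbours to distances.
adj = {
    9: {2: 1.0, 8: 5.0},
    2: {9: 1.0, 8: 1.0},
    8: {2: 1.0, 9: 5.0},
}
print(fewest_hops(adj, 9, 8))
print(shortest_distance(adj, 9, 8))
```

On this toy graph the hop-count search takes the direct but long edge between groups 9 and 8, while the distance-weighted search goes through group 2.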

show_connected_label : boolean, default=True

Whether to show the named labels of the connected data points, where the named labels are given by the pandas DataFrame index.

figname : str, optional

Set the figure name for the image to be saved.

fmt : str, default='pdf'

Format of the image to be saved; 'png' is another option.
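Taken together, savefig, figname, and fmt determine where a figure ends up. A minimal sketch of that combination is shown below. It assumes the file is named `<figname>.<fmt>` inside the "img" folder; the helper `figure_path` and the example name are hypothetical, and the name used when figname is None is not stated here, so the sketch requires one.

```python
from pathlib import Path


def figure_path(figname, fmt="pdf", folder="img"):
    """Hypothetical helper: assemble the output path of a saved figure."""
    return Path(folder) / f"{figname}.{fmt}"


target = figure_path("explain_1_1300", fmt="png")
print(target.as_posix())
```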