Publish Date (Asc)
published_at 2011-03
Creators : 谷合 由章 Dissertation Number : 理工博甲第534号 Degree Names : 博士(理学) Date Granted : 2011-03-16 Degree Grantors : Yamaguchi University
published_at 2011
Creators : Kobayashi Koichiro Dissertation Number : 理工博甲第537号 Degree Names : 博士(理学) Date Granted : 2011-03-16 Degree Grantors : Yamaguchi University
published_at 2011
Creators : Sugiyama Koichiro Dissertation Number : 理工博甲第538号 Degree Names : 博士(理学) Date Granted : 2011-03-16 Degree Grantors : Yamaguchi University
published_at 2010
Creators : 片山 直樹 Dissertation Number : 理工博甲第529号 Degree Names : 博士(理学) Date Granted : 2010-09-27 Degree Grantors : Yamaguchi University
published_at 2011
Creators : Miyoshi Tatsuki Dissertation Number : 理工博甲第553号 Degree Names : 博士(理学) Date Granted : 2011-05-18 Degree Grantors : Yamaguchi University
published_at 2011
Creators : 越地 尚宏 Dissertation Number : 理工博甲第554号 Degree Names : 博士(理学) Date Granted : 2011-06-15 Degree Grantors : Yamaguchi University
published_at 2012
Creators : 石田 麻里 Dissertation Number : 理工博甲第571号 Degree Names : 博士(理学) Date Granted : 2012-03-05 Degree Grantors : Yamaguchi University
published_at 2012
Creators : 相山 光太郎 Dissertation Number : 理工博甲第575号 Degree Names : 博士(理学) Date Granted : 2012-03-16 Degree Grantors : Yamaguchi University
published_at 2012
Creators : 竹中 佑美 Dissertation Number : 理工博甲第576号 Degree Names : 博士(理学) Date Granted : 2012-03-16 Degree Grantors : Yamaguchi University
published_at 2012
Creators : Tokuyasu Kayoko Dissertation Number : 理工博甲第577号 Degree Names : 博士(理学) Date Granted : 2012-03-16 Degree Grantors : Yamaguchi University
published_at 2012
Creators : Kamitomo Itsuki Dissertation Number : 理工博甲第578号 Degree Names : 博士(理学) Date Granted : 2012-03-26 Degree Grantors : Yamaguchi University
published_at 2013
Creators : 呉 靭 Dissertation Number : 理工博乙第124号 Degree Names : 博士(理学) Date Granted : 2013-08-09 Degree Grantors : Yamaguchi University
published_at 2014
Creators : 田 忠原 Dissertation Number : 理工博甲第620号 Degree Names : 博士(理学) Date Granted : 2014-03-17 Degree Grantors : Yamaguchi University
published_at 2002
Creators : Tanaka Hiroshi Dissertation Number : 理工博甲第259号 Degree Names : 博士(理学) Date Granted : 2002-09-30 Degree Grantors : Yamaguchi University
published_at 2015
Creators : 中川 知之 Dissertation Number : 理工博甲第650号 Degree Names : 博士(理学) Date Granted : 2015-03-16 Degree Grantors : Yamaguchi University
published_at 2015
Creators : 林 信雄 Dissertation Number : 理工博甲第652号 Degree Names : 博士(理学) Date Granted : 2015-03-16 Degree Grantors : Yamaguchi University
published_at 2015
Creators : 小島 萌 Dissertation Number : 理工博甲第653号 Degree Names : 博士(理学) Date Granted : 2015-03-16 Degree Grantors : Yamaguchi University
published_at 2015
Creators : 垣内田 翔子 Dissertation Number : 理工博甲第671号 Degree Names : 博士(理学) Date Granted : 2015-05-29 Degree Grantors : Yamaguchi University
published_at 2016
Creators : 古川 翔大 Dissertation Number : 理工博甲第682号 Degree Names : 博士(理学) Date Granted : 2016-03-17 Degree Grantors : Yamaguchi University
published_at 2016
Creators : 村上 裕晃 Dissertation Number : 理工博甲第683号 Degree Names : 博士(理学) Date Granted : 2016-03-17 Degree Grantors : Yamaguchi University
published_at 2013
Creators : Akasaki Eri Dissertation Number : 理工博甲第603号 Degree Names : 博士(理学) Date Granted : 2013-03-18 Degree Grantors : Yamaguchi University
published_at 2016
Creators : Laila Fitriana Dissertation Number : 理工博甲第699号 Degree Names : 博士(理学) Date Granted : 2016-09-30 Degree Grantors : Yamaguchi University
published_at 2017
Creators : 橋爪 善光 Dissertation Number : 理工博乙第139号 Degree Names : 博士(理学) Date Granted : 2017-02-08 Degree Grantors : Yamaguchi University
published_at 2017
Creators : 國安 正志 Dissertation Number : 理工博甲第704号 Degree Names : 博士(理学) Date Granted : 2017-03-16 Degree Grantors : Yamaguchi University
published_at 2017
Creators : 上田 千晶 Dissertation Number : 理工博甲第705号 Degree Names : 博士(理学) Date Granted : 2017-03-16 Degree Grantors : Yamaguchi University
published_at 2017
Creators : 松岡 丈平 Dissertation Number : 理工博甲第706号 Degree Names : 博士(理学) Date Granted : 2017-03-16 Degree Grantors : Yamaguchi University
published_at 2017
Creators : Nutjaree Charoenbunwanon Dissertation Number : 理工博甲第707号 Degree Names : 博士(理学) Date Granted : 2017-03-16 Degree Grantors : Yamaguchi University
published_at 2017
Creators : 中村 一平 Dissertation Number : 理工博甲第708号 Degree Names : 博士(理学) Date Granted : 2017-03-16 Degree Grantors : Yamaguchi University
published_at 2018
Creators : 大神 隆幸 Dissertation Number : 理工博甲第733号 Degree Names : 博士(理学) Date Granted : 2018-03-16 Degree Grantors : Yamaguchi University
published_at 2018
Creators : 岡本 哲 Dissertation Number : 理工博甲第734号 Degree Names : 博士(理学) Date Granted : 2018-03-16 Degree Grantors : Yamaguchi University
published_at 2018
Creators : Nguyen Trong Kuong Dissertation Number : 理工博甲第754号 Degree Names : 博士(理学) Date Granted : 2018-07-31 Degree Grantors : Yamaguchi University
published_at 2018
Creators : 西山 尚登 Dissertation Number : 理工博甲第755号 Degree Names : 博士(理学) Date Granted : 2018-09-27 Degree Grantors : Yamaguchi University
published_at 2019
Creators : 植田 祥明 Dissertation Number : 創科博甲第7号 Degree Names : 博士(理学) Date Granted : 2019-03-18 Degree Grantors : Yamaguchi University
published_at 2019
Creators : 藤林 将 Dissertation Number : 創科博甲第8号 Degree Names : 博士(理学) Date Granted : 2019-03-18 Degree Grantors : Yamaguchi University
published_at 2019
Creators : 熱田 真一 Dissertation Number : 創科博甲第9号 Degree Names : 博士(理学) Date Granted : 2019-03-18 Degree Grantors : Yamaguchi University
published_at 2020
Creators : 田代 啓悟 Dissertation Number : 創科博甲第23号 Degree Names : 博士(理学) Date Granted : 2020-03-16 Degree Grantors : Yamaguchi University
published_at 2020
Creators : 家鋪 真衣 Dissertation Number : 創科博甲第24号 Degree Names : 博士(理学) Date Granted : 2020-03-16 Degree Grantors : Yamaguchi University
published_at 2020
Creators : 中川 孝典 Dissertation Number : 創科博甲第25号 Degree Names : 博士(理学) Date Granted : 2020-03-16 Degree Grantors : Yamaguchi University
published_at 2021
Creators : 須内 寿男 Dissertation Number : 創科博乙第3号 Degree Names : 博士(理学) Date Granted : 2021-03-04 Degree Grantors : Yamaguchi University
published_at 2021
Creators : 児玉 省吾 Dissertation Number : 創科博甲第46号 Degree Names : 博士(理学) Date Granted : 2021-03-16 Degree Grantors : Yamaguchi University
Creators : 江島 圭祐 Dissertation Number : 創科博甲第74号 Degree Names : 博士(理学) Date Granted : 2022-03-16 Degree Grantors : Yamaguchi University
Creators : 柴田 義大 Dissertation Number : 創科博甲第75号 Degree Names : 博士(理学) Date Granted : 2022-03-16 Degree Grantors : Yamaguchi University
Creators : Tanaka Shohei Dissertation Number : 創科博甲第76号 Degree Names : 博士(理学) Date Granted : 2022-03-16 Degree Grantors : Yamaguchi University
In high strain-rate zones, which are active regions of ongoing crustal deformation where earthquakes occur frequently, the total slip rate of the active faults is not consistent with the strain rate detected by geodesy. This discrepancy is one of the most significant issues in crustal deformation and is known as the "strain-rate paradox". Previous crustal deformation models were constructed mainly from major active faults alone, whereas minor faults are often recognized in high strain-rate zones. The aims of this thesis are to solve the strain-rate paradox and to propose a new picture of crustal deformation by focusing on minor faults. To accomplish these goals, two representative high strain-rate zones, the San-in Shear Zone (SSZ) and the Niigata-Kobe Tectonic Zone (NKTZ), were targeted. Through topographical and geological approaches, a universal model of the high strain-rate zones and their origin, deformation process, and mechanism were clarified. The main outcomes are as follows: (1) Minor faults in the NKTZ, which mostly trend NE-SW to ENE-WSW, are a few mm to a few tens of cm wide and exhibit a dextral sense of shear. These minor faults are distributed both in the vicinity of and away from the major active faults. In addition, an active fault whose core zone is 5 m thick was found. This fault shows a dextral sense of shear, and its latest slip event occurred after AD 1521-1658, suggesting that the fault clearly contributes to the dextral deformation of the NKTZ. The origins of such faults are thought to be tensile cracks formed in the Cretaceous, suggesting that the faults came to contribute to the dextral deformation of the NKTZ through repeated faulting along the cracks. The minor faults away from the major active faults are also thought to contribute to the deformation of the NKTZ, whereas minor faults outside the NKTZ cannot contribute to it. 
(2) Minor faults in the SSZ, which mostly trend ENE-WSW to NE-SW, are a few mm to a few tens of cm wide and exhibit a dextral sense of shear. Other minor faults, trending NW-SE to NNW-SSE with steep dips, show a sinistral sense of shear. Minor faults trending E-W with steep dips show a dextral sense of shear. Active faults whose attitudes are nearly parallel to the SSZ were also newly recognized. Such a fault is a few cm thick and is thought to have undergone dextral-reverse oblique slip after 18648-16313 cal. BC. The frameworks of the major active faults in the SSZ are thought to have been prepared along geological boundaries, and such faults have grown through repeated activity since the Paleogene. It is considered that not only the major active faults but also minor faults away from them can contribute to the dextral motion of the SSZ. In contrast, only reverse faults were recognized in the area outside the SSZ. (3) The minor faults in the high strain-rate zones, including those away from the major active faults, can contribute to the dextral deformation of the zones because of their attitudes and senses of shear. On the other hand, the minor faults outside the high strain-rate zones cannot contribute to the dextral deformation of the zones, owing to their attitudes and senses of shear. Thus, there is a noteworthy difference between minor faults inside and outside the high strain-rate zones. Combining these outcomes, a hierarchical structure of the high strain-rate zones can be constructed as follows: (I) fault cores of major active faults, (II) damage zones of major active faults, (III) brittle shear zone (or active background; the area beyond the damage zone but within the SSZ), and (IV) inactive background (outside the high strain-rate zone). This new model makes it possible to partly solve the strain-rate paradox for both zones, although the occurrence of faults differs between the zones. 
The NKTZ is characterized by NE-SW to ENE-WSW-trending minor faults ranging from a few mm to a few tens of cm in thickness. Its active faults possess fault cores more than 5 m thick. The SSZ is characterized by NW-SE or E-W-trending minor faults ranging from a few mm to a few tens of cm in thickness. Some faults show Quaternary activity, but fault cores a few meters thick were not found. These differences in fault occurrence are considered to derive from the evolutionary processes of the zones. Repeated activity along pre-existing structures is thought to have led to the present active faults. Thus, the faults can be considered to have developed in response to the local geological background, resulting in their dextral contribution to the high strain-rate zones. By focusing on minor faults, this study clarified a universal model of the high strain-rate zones and their origin, deformation process, and mechanism. These achievements can constrain models of crustal deformation and interpretations of geodetic observations, and can contribute to assessments of large-scale construction projects and seismic hazards.
Creators : Tamura Tomonori Dissertation Number : 創科博甲第77号 Degree Names : 博士(理学) Date Granted : 2022-03-16 Degree Grantors : Yamaguchi University
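The Median Tectonic Line (MTL) is a fault about 1000 km long that crosses Southwest Japan from east to west. Near Saijo City, Ehime Prefecture, the MTL comprises a low-angle fault zone that forms the tectonic boundary between the Sambagawa metamorphic belt and the Izumi Group (MTLTB; MTL inactive terrane boundary) and a high-angle active fault zone that runs parallel to it on its northern side (MTLAFZ; MTL active fault zone). To determine the dip of the MTLAFZ at the surface, a trench about 10 m long and about 2 m deep was excavated across the Kawakami fault. In addition, to clarify how the two faults, which run parallel about 10 m apart at the surface, join at depth and to determine the dips of the fault planes, six boreholes 80-330 m long were drilled across the faults. Furthermore, to characterize the fault structure and the physical properties of the ground over a wider area, a seismic reflection survey along a 1200 m line and a high-density electrical resistivity survey along a 500 m line were conducted. Using the collected fault samples, chemical analysis of the fault rocks, description of deformation structures, measurement of calcite twin density, and analysis of the deformation phases of the fault were carried out, and the mechanism of low-angle strike-slip faulting and the history of fault activity were clarified. The trench survey, borehole survey, and high-density electrical survey suggested that the Kawakami fault, which dips about 70° to the north at the surface, converges at depth with the MTLTB, which dips 30° to the north, showing that the subsurface MTLTB is an active fault. The resistivity of the Izumi Group in the hanging wall of the MTLTB, where minor faults are dominant, was lower than that of the hard Sambagawa metamorphic rocks in the footwall, except for a high-resistivity part presumed to be a sparsely fractured andesite block in the main fracture zone. A low-resistivity zone, in which deep fluids are inferred to be rising along the fault, was also identified. The seismic reflection survey revealed a clear reflector dipping about 30° to the north that corresponds to the MTLTB, showing that the fault extends to greater depth. EPMA analysis of minerals in the serpentinite that makes up the main fracture zone showed that it contains magnesiochromite of mantle origin. Previous deep seismic surveys have shown that the deep extension of the MTL reaches the lower crust; the present result suggests that the MTLTB extends to the mantle and that the serpentinite rose to the near surface through fault displacement and diapirism. Because the fault plane of the MTLTB dips at a low angle, the MTLTB would normally be expected to resist strike-slip movement. The abundance of phyllosilicate minerals in the fault gouge and main fracture zone of the MTLTB and the presence of deep fluids along the fault are considered to reduce the shear strength of the fault, making strike-slip motion possible even on a low-angle fault. The strain distribution across the MTLTB obtained from calcite twin density decreases gently and linearly away from the fault, suggesting that the shear strength of the fault is reduced. In the deformation-phase analysis, four deformation phases, D1 to D4 from oldest to youngest, were defined on the basis of the geometric features of the MTLTB and MTLAFZ, the contact relationships between each fault and the strata, structural-geological features such as the sense of fault displacement, and paleostress analysis. D1 is deformation under NNE-SSW compression, a sinistral strike-slip movement in which the hanging wall moved westward around the middle Eocene (47-46 Ma). D2 is deformation under E-W extension, a normal-fault movement in which the hanging wall moved northward around the middle Miocene (15-14 Ma). D3 is deformation under NNW-SSE compression, a reverse-fault movement in which the hanging wall moved southward from the middle Miocene to the late Pliocene (14-3 Ma). D4 is deformation under WNW-ESE compression, a dextral strike-slip movement in which the hanging wall has moved eastward since the late Pliocene to early Pleistocene (3-1 Ma). Many cities lie along the MTL, which crosses Southwest Japan, and geometric information such as the dip of the MTL is considered an important parameter for predicting the distribution of earthquake damage and earthquake magnitude. Moreover, the MTLAFZ is thought to converge with the MTLTB at a shallow depth within a few kilometers of the surface, suggesting that the MTLTB, previously regarded as an inactive geological fault, may be displaced as an active fault in the future.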
Creators : Miyawaki Masahiro Dissertation Number : 創科博甲第98号 Degree Names : 博士(理学) Date Granted : 2022-09-27 Degree Grantors : Yamaguchi University
Creators : Wu Zhenyuan Dissertation Number : 創科博甲第104号 Degree Names : 博士(理学) Date Granted : 2023-03-03 Degree Grantors : Yamaguchi University
Creators : 星長 翔太 Dissertation Number : 創科博甲第105号 Degree Names : 博士(理学) Date Granted : 2023-03-16 Degree Grantors : Yamaguchi University
Hyperspectral (HS) imaging can capture the detailed spectral signature of each spatial location in a scene and leads to a better understanding of material characteristics than traditional imaging systems. However, existing HS sensors can in practice provide only low spatial resolution images at video rate. Reconstructing a high-resolution HS (HR-HS) image by fusing a low-resolution HS (LR-HS) image and a high-resolution RGB (HR-RGB) image with image processing and machine learning techniques, known as hyperspectral image super-resolution (HSI SR), has therefore attracted a lot of attention. Existing methods for HSI SR fall mainly into two research directions: mathematical model-based methods and deep learning-based methods. Mathematical model-based methods generally formulate the degradation procedure of the observed LR-HS and HR-RGB images with a mathematical model and employ an optimization strategy to solve it. Because the fusion problem is ill-posed, most works leverage hand-crafted priors to model the underlying structure of the latent HR-HS image and pursue a more robust solution. Recently, deep learning-based approaches have been developed for HS image reconstruction, and current efforts have mainly concentrated on designing more complicated and deeper network architectures to pursue better performance. Although impressive reconstruction results can be achieved compared with the mathematical model-based methods, the existing deep learning methods have the following three limitations. 1) They are usually implemented in a fully supervised manner and require a large-scale external dataset including the degraded observations (the LR-HS/HR-RGB images) and their corresponding HR-HS ground-truth images, which are difficult to collect, especially for the HSI SR task. 
2) They aim to learn a common model from training triplets and are therefore insufficient to model the abundant image priors of various HR-HS images with rich contents, whose spatial structures and spectral characteristics differ considerably. 3) They generally assume that the spatial and spectral degradation procedures for capturing the LR-HS and HR-RGB images are fixed and known, and synthesize the training triplets accordingly to learn the reconstruction model, which yields very poor recovery performance for observations with different degradation procedures. To overcome these limitations, our research focuses on an unsupervised learning-based framework for HSI SR that learns the specific prior of the scene under study without any external dataset. To deal with observed images captured under different degradation procedures, we further automatically learn the spatial blurring kernel and the camera spectral response function (CSF) related to the specific observations, and incorporate them into the above unsupervised framework to build a highly generalized blind unsupervised HSI SR paradigm. Moreover, motivated by the fact that cross-scale pattern recurrence frequently exists in natural images, we synthesize pseudo training triplets from degraded versions of the LR-HS and HR-RGB observations and the observations themselves, and conduct supervised and unsupervised internal learning to obtain a specific model for HSI SR, dubbed generalized internal learning. Overall, the main contributions of this dissertation are three-fold and are summarized as follows: 1. A deep unsupervised fusion-learning framework for HSI SR is proposed. 
Inspired by the insight that convolutional neural networks themselves capture a large amount of low-level image statistics (priors) and can more easily generate images with regular spatial structure and spectral patterns than noisy data, this study proposes an unsupervised framework that automatically generates the target HS image from the LR-HS and HR-RGB observations only, without any external training database. Specifically, we explore two paradigms for HS image generation: 1) learning the HR-HS target using randomly sampled noise as the input of the generative network, from a data generation view; 2) reconstructing the target using the fused context of the LR-HS and HR-RGB observations as the input of the generative network, from a self-supervised learning view. Both paradigms automatically model the specific priors of the scene under study by optimizing the parameters of the generative network instead of the raw HR-HS target. Concretely, we employ an encoder-decoder architecture for our generative network and generate the target HR-HS image from the noise or the fused context input. We assume that the spatial and spectral degradation procedures for the LR-HS and HR-RGB observations are known, so that approximated versions of the observations can be produced by degrading the generated HR-HS image; the reconstruction errors of the observations then serve directly as the loss function for network training. Our unsupervised learning framework can not only model the specific prior of the scene under study to reconstruct a plausible HR-HS estimate without any external dataset, but can also be easily adapted to observations captured under various imaging conditions simply by changing the degradation operations in the framework. 2. A novel blind learning method for unsupervised HSI SR is proposed. 
The deep unsupervised framework for HSI SR described above requires the spatial and spectral degradation procedures to be known. However, different optical designs of HS imaging devices and RGB cameras cause various degradation processes, such as the spatial blurring kernels for capturing LR-HS images and the camera spectral response functions (CSF) of the RGB sensors, and it is difficult for general users to obtain this detailed knowledge. Moreover, the concrete computation in the degradation procedures is further distorted under various imaging conditions. Hence, in real applications, it is hard to know the degradation for each scene under study. To handle this issue, this study develops a novel parallel blind unsupervised approach that automatically and jointly learns the degradation parameters and the generative network. Specifically, depending on the unknown components, we propose three methods: 1) a spatial-blind method that automatically learns the spatial blurring kernel in the capture of the LR-HS observation, given the known CSF of the RGB sensor; 2) a spectral-blind method that automatically learns the CSF transformation matrix in the capture of the HR-RGB observation, given the known blurring kernel of the HS imaging device; 3) a complete-blind method that simultaneously learns both the spatial blurring kernel and the CSF matrix. Building on our previously proposed unsupervised framework, we design special convolution layers that realize the spatial and spectral degradation procedures in parallel, where the layer parameters are treated as the weights of the blurring kernel and the CSF matrix and are learned automatically. 
The spatial degradation procedure is implemented by a depthwise convolution layer in which the kernels for the different spectral channels are constrained to be identical and the stride is set to the spatial expanding factor, while the spectral degradation procedure is implemented by a pointwise convolution layer with 3 output channels to produce the approximated HR-RGB image. With this learnable implementation of the degradation procedure, we construct an end-to-end framework that jointly learns the specific prior of the target HR-HS image and the degradation knowledge, and build a highly generalized HSI SR system. Moreover, the proposed framework can be unified to realize the different versions of blind HSI SR by fixing the parameters of the corresponding convolution layer to the known blurring kernel or CSF, and is highly adaptable to arbitrary observations for HSI SR. 3. A generalized internal learning method for HSI SR is proposed. Motivated by the fact that natural images have strong internal data repetition and cross-scale internal recurrence, we further synthesize labeled training triplets using the LR-HS and HR-RGB observations only, and combine them with the unlabeled observations as training data to conduct both supervised and unsupervised learning, constructing a more robust image-specific CNN model of the HR-HS data under study. Specifically, we downsample the observed LR-HS and HR-RGB images to their "son" versions and produce training triplets from the LR-HS/HR-RGB sons and the LR-HS observation, where the relation among them is the same as that among the LR-HS/HR-RGB observations and the HR-HS target despite the difference in resolution. With the synthesized training samples, it is possible to train an image-specific CNN model that produces the HR-HS target with the observations as input, dubbed internal learning. 
However, the synthesized labeled training samples are usually few, especially for a large spatial expanding factor, and the further downsampling of the LR-HS observation causes severe spectral mixing of surrounding pixels, so that the spectral mixing levels differ between the training phase and the test phase. These limitations can degrade the super-resolution performance of naive internal learning. To mitigate them, we combine naive internal learning with our self-supervised learning method for unsupervised HSI SR, and present a generalized internal learning method to achieve more robust HR-HS image reconstruction.
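The degradation layers described in this abstract can be illustrated with a minimal NumPy sketch. It is not the dissertation's implementation: the function names, the array layout (bands, height, width), the unpadded (valid) convolution, and the box kernel in the usage lines are all assumptions made for illustration. The spatial step applies one shared kernel to every band with a stride equal to the expanding factor (the behaviour of a depthwise convolution with tied weights), the spectral step is a per-pixel matrix product with a 3-row CSF matrix (the behaviour of a pointwise convolution), and the loss is the reconstruction error against the two observations.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def spatial_degrade(hr_hs, kernel, scale):
    """Blur every band with the same kernel and sample with stride `scale`.

    hr_hs:  array of shape (bands, H, W), the estimated HR-HS image
    kernel: array of shape (k, k), shared by all spectral bands
    scale:  spatial expanding factor, used as the stride
    """
    k = kernel.shape[0]
    # windows has shape (bands, H-k+1, W-k+1, k, k)
    windows = sliding_window_view(hr_hs, (k, k), axis=(1, 2))
    # keep every `scale`-th window, i.e. a strided (valid) convolution
    windows = windows[:, ::scale, ::scale]
    return np.einsum("bijkl,kl->bij", windows, kernel)


def spectral_degrade(hr_hs, csf):
    """Mix the spectral bands of each pixel into 3 RGB channels.

    csf: array of shape (3, bands), the camera spectral response matrix
    """
    return np.einsum("cb,bhw->chw", csf, hr_hs)


def observation_loss(hr_hs_est, lr_hs, hr_rgb, kernel, csf, scale):
    """Sum of mean squared reconstruction errors of the two observations."""
    lr_est = spatial_degrade(hr_hs_est, kernel, scale)
    rgb_est = spectral_degrade(hr_hs_est, csf)
    return np.mean((lr_est - lr_hs) ** 2) + np.mean((rgb_est - hr_rgb) ** 2)


# Usage with synthetic data (hypothetical sizes: 31 bands, 16x16 pixels, factor 4).
rng = np.random.default_rng(0)
hr_hs = rng.random((31, 16, 16))
kernel = np.full((4, 4), 1.0 / 16.0)   # assumed box blur
csf = rng.random((3, 31))              # assumed random CSF
lr_hs = spatial_degrade(hr_hs, kernel, 4)
hr_rgb = spectral_degrade(hr_hs, csf)
loss = observation_loss(hr_hs, lr_hs, hr_rgb, kernel, csf, 4)
```

In the blind variants the kernel and the CSF matrix would be trainable parameters optimized jointly with the generative network, which requires an automatic-differentiation framework; here they are fixed arrays so that only the forward degradation is shown.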
Creators : LIU ZHE Dissertation Number : 創科博甲第120号 Degree Names : 博士(理学) Date Granted : 2023-09-26 Degree Grantors : Yamaguchi University
Many mail filtering methods have been proposed, but none has yet achieved perfect filtering. One reason is the influence of modified words that spammers create to slip through mail filters, in which words are modified by inserting symbols, spaces, HTML tags, etc. Examples include "price$ for be$t drug$!", "priceC I A L I S", and "<font>se</font>xu<font>al</font>". These are frequently replaced with new strings by changing the combination of symbols, HTML tags, etc. Mail filtering is a technique that captures trends in the words of training mails (mails received in the past) and applies these trends to the words of test mails (newly received mails). Some of the above modified words appear in both training and test mails, i.e., they can be used unprocessed as features of spam mail, while others appear only in test mails, i.e., they have not been learned and require special processing (e.g., removal of symbols, search for similar words) before they can be used. However, existing methods do not make this distinction and treat both kinds in the same way. Therefore, to bring the filtering performance of existing methods closer to perfect filtering, we developed a method in which the above modified words are separated into words that appear in both training and test mails and words that appear only in test mails, and each group is used for mail filtering. In this study, we refer to the above modified words as "strange words". Typical examples of strange words include, in addition to the above, new words in ham mails, proper nouns used in close relationships, and abbreviations. The results of this study are as follows. (1) To compare the filtering performance of strange words with that of other words, filtering experiments were conducted using existing methods with strange words, nouns, verbs, and adjectives. The results showed that strange words gave the best filtering performance. 
This means that strange words have a significant impact on filtering performance, and we expect to improve the performance of existing methods by developing a new method that utilizes strange words. (2) To examine the breakdown of strange words, we counted the number of words that appeared in both training and test mails and the number that appeared only in test mails. The results were compared with those for nouns, verbs, and adjectives. We found that a significant number of strange words appear in both training and test mails but in only one of the two classes, i.e., only in ham mail or only in spam mail. Words with this appearance pattern are the most useful for mail filtering. On the other hand, we found that many strange words appear only in test mails, i.e., they cannot be learned. We expect to improve filtering performance by separating these two kinds of strange words and developing a method to use each of them. (3) For the use of strange words, we developed (A) a method for using words that appear in both training and test mails and (B) a method for using words that appear only in test mails. (A) To examine the breakdown of strange words that appear in both training and test mails, we divided them into two categories, words that appear only in ham mail or only in spam mail (i.e., words with appearance patterns that improve filtering performance) and words that do not, and examined their frequency of occurrence. The results showed that words with appearance patterns that improve filtering performance tend to appear more frequently than those without such patterns. This means that by using only words with at least a certain number of occurrences, a larger share of the words used will be ones that improve filtering performance. We developed a method to do this, conducted experiments with different threshold values to find the optimal value, and confirmed that setting the threshold to around 7 improves filtering performance. 
(B) We compared the number of strange words that appear only in test mails between ham and spam mails, and found that the number tends to be higher in spam mail than in ham mail. To utilize this difference for filtering, we proposed a method that sets a uniform spam probability for strange words that appear only in test mails, and searched for the optimal spam probability. As a result, setting the spam probability to 0.7 improved the filtering accuracy from 98.2% to 98.9%. By using (A) and (B) together, both words that appear in both training and test mails and words that appear only in test mails can be used for mail filtering to increase accuracy. Mail filtering has been steadily improved, and its performance has reached its limit. To further improve accuracy, i.e., to approach perfect filtering, a new perspective is needed, and this paper provides one such perspective: the use of strange words. This paper is organized as follows. In Chapter 1, we review the background of mail filtering methods and discuss how spammers use strange words to slip through such filters. The purpose and structure of this paper are then presented. In Chapter 2, we discuss related research and give examples of filtering methods that have been proposed so far. In Chapter 3, we describe the mail datasets, word handling, and strange words used in this paper. This is followed by an explanation of the ROC curve, which is the measure used to evaluate filtering performance, and of scatter plots and box-and-whisker plots. In Chapter 4, we compare the filtering performance of strange words with that of other words, and show that strange words have a significant impact on filtering performance. Furthermore, based on a breakdown of the number of strange words, we discuss the possibility of improving filtering performance by separating words that appear in both training and test mails from those that appear only in test mails. 
We work on this in the following chapters and report the results. In Chapter 5, we develop a method to use (A) above, i.e., strange words that appear in both training and test mails. From the results of counting the number of words used in the subject and body of each mail, we show that the count tends to be smaller for words that degrade filtering performance. Based on these results, we propose a method that sets a threshold on the number of times a word is used in the subject and body of mails and uses only words that exceed the threshold for classification. Experiments are conducted to find the optimal value by varying the threshold, and the effect of this method on performance is reported. In Chapter 6, we develop a method to use (B) above, i.e., strange words that appear only in test mails. We compare the number of types of these words in ham and spam mails, show that the number tends to be larger in spam mails, and show that this feature can be used as a bias for detecting spam mails. In this paper, we conduct experiments using bsfilter and develop a method that sets a uniform spam probability for strange words that appear only in test mails. After searching for the optimal spam probability, we report that a spam probability of 0.7 greatly improves filtering performance. In Chapter 7, we describe the processing flow that combines the methods developed in Chapters 5 and 6. The paper is then summarized, including future prospects.
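The two ideas in this abstract, an occurrence threshold for words seen in training (A) and a uniform spam probability for words seen only in test mails (B), can be sketched roughly as follows. This is an illustrative toy, not the author's method and not bsfilter's algorithm or API: the function names, the whitespace-free token lists, the smoothing, the reading of the threshold as a total training count, and the log-odds combination are all assumptions. Only the values 7 and 0.7 come from the abstract.

```python
import math
from collections import Counter

COUNT_THRESHOLD = 7       # threshold reported for method (A)
UNSEEN_SPAM_PROB = 0.7    # uniform probability reported for method (B)


def train(ham_mails, spam_mails):
    """Count token occurrences per class; each mail is a list of tokens."""
    ham = Counter(tok for mail in ham_mails for tok in mail)
    spam = Counter(tok for mail in spam_mails for tok in mail)
    return ham, spam, len(ham_mails), len(spam_mails)


def token_spam_prob(tok, model):
    """Return a spam probability for one token, or None if it is ignored."""
    ham, spam, n_ham, n_spam = model
    h, s = ham[tok], spam[tok]
    if h + s == 0:
        # (B) token never seen in training: assign the uniform probability
        return UNSEEN_SPAM_PROB
    if h + s <= COUNT_THRESHOLD:
        # (A) token seen too rarely in training: do not use it
        return None
    # assumed add-one smoothed per-mail occurrence rates
    ham_rate = (h + 1) / (n_ham + 1)
    spam_rate = (s + 1) / (n_spam + 1)
    return spam_rate / (spam_rate + ham_rate)


def spam_score(mail, model):
    """Combine token probabilities by summing log-odds (an assumed rule)."""
    log_odds = 0.0
    for tok in set(mail):
        p = token_spam_prob(tok, model)
        if p is not None:
            log_odds += math.log(p) - math.log(1.0 - p)
    log_odds = max(-50.0, min(50.0, log_odds))  # keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-log_odds))


# Usage with made-up training mails.
ham_mails = [["meeting", "report"]] * 10
spam_mails = [["drug$", "price$"]] * 10 + [["rare"]]
model = train(ham_mails, spam_mails)
score = spam_score(["be$t", "drug$"], model)   # unseen token plus a spam token
```

Treating a mail as spam when the score exceeds 0.5 is one possible decision rule; the abstract evaluates performance with ROC curves, which vary that cutoff.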
Creators : Temma Seiya Dissertation Number : 創科博甲第126号 Degree Names : 博士(理学) Date Granted : 2023-10-11 Degree Grantors : Yamaguchi University